Preface


I start this preface with some ideas of my former Teacher and 
Master, senior researcher I, corresponding member of the Romanian 
Academy, Dr. Doc. Nicolae Popescu (Institute of Mathematics of the 
Romanian Academy). 

Question: What is Mathematics? 

Answer: It is the art of reasoning, thinking or making judgements. 
It is difficult to say more, because we are not able to define exactly
even the notion of a "table", let alone Math! In the Greek language,
"mathema" means "knowledge". Do you think that there is somebody who is
able to define this last notion? And so on... Let us do Math, let us
apply or teach it, and let us stop searching for a definition of it!

Q: Is Math like Music? 

A: Since any human activity involves more or less need of reasoning,
Mathematics is more connected with our everyday life than all the other
arts. Moreover, any description of natural or social phenomena uses
mathematical tools.

Q: What kind of Mathematics is useful for an engineer? 

A: Firstly, basic Analysis, because it is the best tool
for strengthening the ability to make correct judgements and to take
appropriate decisions. Formulas and notions of Analysis are at
the basis of the particular language used by engineering topics
like Mechanics, Material Sciences, Elasticity, Concrete Sciences, etc.
Secondly, Linear Algebra and Geometry develop the ability to work
with vectors and with geometrical objects, to understand some specific
algebraic structures and to use them in applying numerical methods.
Differential Equations, Calculus of Variations and Probability Theory
have a direct impact on the scientific presentation of all engineering
applications. Computer Science cannot be taught without a basic
knowledge of the above mathematical topics. Mathematics comes from
reality and returns to it.

Q: How can we learn Math so that it does not become abstract,
annoying, difficult, etc.?




A: There is only one way. Try to clarify and understand everything, 
step by step, from the simplest notions up to the more complicated 
ones. Without gaps! Try to work with all the new notions, definitions, 
theorems, by looking at appropriate simple examples and by doing 
appropriate exercises. Do not learn by heart! This is the most useless 
thing you can do in trying to become a scientist, an engineer or an 
economist! Or anything else! 

Math becomes nice and easy to you if it is presented in a lively way 
and if you make some efforts to come closer and closer to it. If you 
hate it from the beginning, don’t say that it is difficult! 


The present course of Mathematical Analysis covers the Differential 
Calculus part only. 

It is assumed that students have the basic skills to compute simple
limits, differentials and the integrals of some elementary functions. My
teaching experience of almost 30 years at the Technical University of
Civil Engineering Bucharest made it clear to me that the Math syllabus
for engineering courses is not merely a "part" of the syllabus of the
faculties of mathematics. Engineering teaching should be based on
very "concrete" facts. Mathematics for engineers should be very lively.
Students should realize that this type of Math came from "practice",
returns to it and, what is most important, helps a lot in making rational
"models" for specific phenomena. Besides this point of view,
we must not forget that the most important tool of an engineer,
economist, etc. is his (her) power of reasoning. And this power of
reasoning can be strengthened by mathematical training.

My opinion is that some motivation and drawings are always very
useful in the complicated process of making mathematical teaching
"easy" and "nice".

I consider that it is better to start with the notion of a real number,
which reflects a measurement, and then to consider sequences, series,
functions, etc.

In Chapter I tried to put together some notions and ideas which 
have more features in common. We end every chapter with some problems
and exercises. In some places you will find more detailed examples
and worked problems, in others you will find fewer. At every moment I
have in mind a beginner student and never a professional
in Math. My ultimate goal in this was "the art of teaching Math for
engineers" and not "the art of solving sophisticated Math problems". We
should keep in mind that good Math teaching means "non multa,
sed multum" (C. F. Gauss, in Latin). Gauss wanted to say that
quality is more important than quantity: "not much and superficial,
but less and deep". We have computers which are able to supply
us with formulas and with long and complicated computations but, up to
now, they are not able to teach us deep and original creative
work. They are useful to us, but it is better that the final decision be ours.
The deep "feeling" of an experienced engineer is as important as some
long computations of a computer. If we consider a computer to be only
a "tool", that is fine. But how do we obtain this "feeling"? The answer is: a
good background (including Math training) + practice + the capacity
of doing things better and better.

I tried to use as proofs for theorems, propositions, lemmas, etc. the
most direct, simple and natural proofs that I know, so that the
student is able to really understand what the statement wants to say.
Mathematical "tricks" and simplifications by means of more abstract
mathematical machinery are not so appropriate in teaching Math, at
least for the non-mathematical community. This is why we (teachers)
should think twice before accepting a new "shorter" way. My opinion
is that a student should begin with a particular case, with an example,
in order to understand a more general situation. Even in the case of a
definition you should search for examples and "counterexamples", and you
should work with them until you become "a friend" of theirs...


I am grateful to many people who helped me directly or indirectly. 
The long discussions with some of my colleagues from the Department 
of Mathematics and Computer Sciences of the Technical University of 
Civil Engineering Bucharest enlightened me a lot. In particular, the 
teaching skill, the knowledge and the enthusiasm of Prof. Dr. Gavriil 
Paltineanu impressed and encouraged me in writing this course. He is
always trying to really improve the teaching of Mathematical Analysis in
our university and he gave me much useful advice after reading
this course.

Many thanks go to Prof. Dr. Octav Olteanu (University Politehnica 
Bucharest) for many useful remarks on a previous version of this course. 

I learned to be clear and to try to prove "everything" from Prof.
Dr. Mihai Voicu, who previously taught this course for many
years.

The friendly climate created around us by our departmental chiefs
(Prof. Dr. ing. Nicoleta Radulescu, Prof. Dr. Gavriil Paltineanu,
Prof. Dr. Romica Trandafir, etc.) contributed greatly to the
natural development of this project.

I thank my assistant professor Marilena Jianu for the many corrections
made during the reading of this material.

A special thought goes to the late Dr. Ion Petrica who (many years
ago) had the "feeling" that I could write a "popular" book of Math
Analysis with the title "Analysis is easy, isn't it?".

Last but not least, I express my gratitude to my wife for
helping me with the drawings and for the great patience she had during my
writing of this book.

I will be very grateful to all readers who send me their remarks
on this course at the e-mail address angel.popescu@gmail.com,
in order to improve everything in future editions.


Prof. Dr. Sever Angel Popescu 
Bucharest, January, 2009. 


CHAPTER 1 
The real line. 


1. The real line. Sequences of real numbers 


To measure is a basic human activity. To measure time, temperature,
velocity, etc., reduces to measuring lengths of segments on a line.
For this, we need a fixed point $O$ on a straight line $(d)$ and a
"witness" oriented segment $[OA_1]$ ($A_1 \neq O$), i.e. a unit vector
$\overrightarrow{OA_1}$ (see Fig. 1.1). Here, unit means that in all our
considerations the length of the segment $[OA_1]$ is taken to be 1 meter. The
pair $(O, \vec{i})$, where $\vec{i} = \overrightarrow{OA_1}$, is called a Cartesian
coordinate system (or a frame of reference), after the French
mathematician R. Descartes, the father of Analytic Geometry, which, in
short, means studying figures by means of numbers. We assume that the reader has a
practical knowledge of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, which represent
(in Fig. 1.1) the points $O, A_1, A_2, \dots, A_9$. Let us now consider the point
$B$ on the line $(d)$ such that the length of the vector $\overrightarrow{A_9B}$ is 1
meter and $B \neq A_8$, i.e. $\overrightarrow{A_9B} = \overrightarrow{OA_1}$ as FREE vectors.


[Figure; recoverable labels: "inverse orientation" and $\overrightarrow{OA_3} = 3\,\overrightarrow{OA_1}$]
Fig. 1.1


Our intention is to associate a sequence of digits to the point $B$.
Here appears a first great idea of an anonymous inventor who denoted
$B$ by $A_{10}$; this means one group of ten units (a unit is one $\overrightarrow{OA_1}$) and 0
(nothing) from the next similar group. For instance, $A_{64}$ is the point
on $(d)$ between the points $A_{60}$ and $A_{70}$ which marks 6
groups of ten units + 4 units from the 7-th group. Now $A_{269}$ marks
2 groups of hundreds + 6 groups of tens + 9 units, and so on. In
this way we can represent on the real line $(d)$ any quantity which is
a multiple of a unit (for instance 130 km/h if the unit is 1 km/h).
The idea of grouping in units, tens, hundreds, thousands, etc. supplies
us with an addition law for the set of the so-called "natural numbers":
$0, 1, 2, \dots, 9, 10, 11, \dots, 99, 100, 101, \dots$. We denote this last set by $\mathbb{N}$.
For instance, let us explain what happens in the following addition:


$$
\begin{array}{rrrr}
  & 3 & 6 & 8 \\
+ &   & 9 & 7 \\
\hline
  & 4 & 6 & 5
\end{array}
\qquad (1.1)
$$

First of all let us see what we mean by 368. Here one has 3 groups
of one hundred each + 6 groups of one ten each + 8 units (i.e. 8 times
$\overrightarrow{OA_1}$). We now explain the result 465 (= 368 + 97): 8 units + 7 units
is equal to 15 units. This means 5 units and 1 group of ten units. This
last 1 must be added to 6 + 9 and we get 16 groups of ten units each.
Since 10 groups of 10 units make a group of 1 hundred, we must write
6 for tens and add this last 1 to 3. So one gets 4 for hundreds.

We say that a point $A$ on the line $(d)$ is "less" than the point $B$ on the same
line if $B$ is on the right of $A$ and not equal to it. Assume now
that $A$ is represented by the sequence of digits $a_n a_{n-1} \dots a_0$ ($a_0$ units, $a_1$
tens, etc.) and $B$ by the sequence $b_m b_{m-1} \dots b_0$. Here we suppose that $a_n$
and $b_m$ are different from 0 and that $n \geq m$. Otherwise, we interchange $A$ and
$B$. Think now of the way we defined these sequences!
If $n > m$, then $A$ must be on the right of $B$, i.e. $A$ is greater than $B$.
If $n = m$, but $a_n > b_n$, again $A$ is greater
than $B$. If $n = m$, $a_n = b_n$, but $a_{n-1} < b_{n-1}$, then $B$ is greater than
$A$. If $n = m$, $a_n = b_n$, $a_{n-1} = b_{n-1}$, we compare $a_{n-2}$ with $b_{n-2}$, and
so on. If all the corresponding terms of the above sequences are equal
to each other (and $n = m$), then $A$ is identical with $B$. If,
for instance, $n = m$, $a_n = b_n$, $a_{n-1} = b_{n-1}$, ..., $a_{k+1} = b_{k+1}$, but $a_k > b_k$,
we must have $A > B$ ($A$ is greater than $B$). Here in fact we described
what is called the "lexicographic order" in the set of finite sequences
(define it!). If $A > B$ one can subtract $B$ from $A$ as in the following
example:


$$
\begin{array}{rrrr}
  & 3 & 6 & 8 \\
- &   & 9 & 7 \\
\hline
  & 2 & 7 & 1
\end{array}
\qquad (1.2)
$$
This operation is as natural as the addition. Namely, 8 units minus
7 units is 1 unit. Since we cannot subtract 9 tens from 6 tens, we
"borrow" 1 hundred = 10 tens from 3. So, now 10 tens + 6 tens
= 16 tens minus 9 tens is equal to 7 tens. There remain 2 hundreds, from
which we subtract 0 hundreds and obtain 2 hundreds. Instead of 10
tens we write $10 \times 10 = 10^2$ units, etc. Thus, any natural number
$A = a_n a_{n-1} \dots a_0$ (we identified here the name of the point with its
corresponding sequence of digits) can be uniquely written as:

$$A = a_0 + 10 a_1 + 10^2 a_2 + \dots + 10^n a_n. \qquad (1.3)$$
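As an illustration that is not part of the original course, the schoolbook rules described above (digits as groups of units, tens, hundreds; carrying in addition; borrowing in subtraction) can be sketched in Python. The function names `to_digits`, `from_digits`, `add_digits` and `subtract_digits` are our own hypothetical choices, and digits are stored with the units digit first.

```python
def to_digits(value, base=10):
    """Digits of a natural number, units digit first: [a_0, a_1, ..., a_n]."""
    if value == 0:
        return [0]
    digits = []
    while value > 0:
        value, remainder = divmod(value, base)
        digits.append(remainder)
    return digits


def from_digits(digits, base=10):
    """Evaluate a_0 + base*a_1 + base^2*a_2 + ... + base^n*a_n."""
    return sum(d * base**i for i, d in enumerate(digits))


def add_digits(a, b, base=10):
    """Schoolbook addition: add column by column and carry groups of `base`."""
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        total = carry
        if i < len(a):
            total += a[i]
        if i < len(b):
            total += b[i]
        carry, digit = divmod(total, base)
        result.append(digit)
    if carry:
        result.append(carry)
    return result


def subtract_digits(a, b, base=10):
    """Schoolbook subtraction a - b with borrowing; assumes a >= b as numbers."""
    result, borrow = [], 0
    for i in range(len(a)):
        diff = a[i] - borrow - (b[i] if i < len(b) else 0)
        borrow = 1 if diff < 0 else 0
        result.append(diff + base if diff < 0 else diff)
    while len(result) > 1 and result[-1] == 0:
        result.pop()  # drop leading zeros of the result
    return result


print(from_digits(add_digits(to_digits(368), to_digits(97))))       # 368 + 97
print(from_digits(subtract_digits(to_digits(368), to_digits(97))))  # 368 - 97
```

The `base` parameter is only a convenience of this sketch: with `base=2` the same routines group the units in pairs instead of tens.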


This is also called the representation of $A$ in base (of numeration)
10. If instead of grouping units, tens, hundreds, etc., in groups of 10,
we group them in groups of 2 for instance, we obtain the representation of the
same point A in base 2, etc. Why our ancestors chose 10, ... we do not 


Hence, subtraction is not defined for every pair $A, B$. This means
that $A - B$ does not belong to $\mathbb{N}$ for every pair $A, B$. For instance, $3 - 4$
is not in $\mathbb{N}$, but it is in $\mathbb{Z}$! The algebraists say that $\mathbb{N}$ is a monoid
and $\mathbb{Z}$ is a group (see any advanced Algebra course), relative to
addition. We can also introduce a multiplication in $\mathbb{Z}$. First of all, if
$n, m$ are in $\mathbb{N}$ and neither is zero (otherwise we put $n \cdot m = 0$), we
define $n \cdot m$ (also written $nm$) as $n + n + \dots + n$, $m$ times. To extend this
operation to $\mathbb{Z}$, we put by definition $(-n)m = n(-m) = -(nm)$, for
any pair $n, m$ of $\mathbb{N}$. The algebraists say that $\mathbb{Z}$ is a ring relative to
addition and this last defined multiplication (see the Algebra course).
We use here freely the elementary basic properties of addition and
multiplication. For instance, $5 \cdot (7 - 9) = 5 \cdot 7 - 5 \cdot 9$, because of the
distributive property.

We also have a dynamic interpretation of the set $\mathbb{N}$: 0 is for $O$; 1
is for the extremity $A_1$ of the vector $\overrightarrow{OA_1}$; 2 is for the extremity of the
vector $\overrightarrow{OA_2}$, which is twice the vector $\overrightarrow{OA_1}$; etc. We must remark that
we have just chosen "an orientation" on the line $(d)$: namely, we started
our above construction "from $O$ to the right", not "to the left". So,
on $(d)$ one has two orientations: the direct one, "to the right", and the
inverse one, "to the left". If we construct everything again "on the
left" (by symmetry), we get the set of negative integers: $-1, -2, -3, \dots$.
The whole set $\mathbb{Z} = \{\dots, -3, -2, -1, 0, 1, 2, 3, \dots\}$ is called the set of
integers.

By "Arithmetic" we mean all the properties of $\mathbb{N}$ (or $\mathbb{Z}$) derived from
the "algebraic" operations of addition and multiplication. A prime
number $p$ is a natural number different from 1 which cannot be written as
a product $p = nm$, where $n$ and $m$ are natural numbers, both different from
1 (or from $p$). For instance, 2, 3, 5, 7, 11, 13, 17, ... are prime numbers. Any
natural number $n$ greater than 1 is either a prime number or it can be
decomposed into a finite product of prime numbers (Euclid). Indeed,
if $n$ is not a prime number, there are natural numbers $n_1, n_2$ such that
$n = n_1 n_2$, where $n_1, n_2 < n$. We go on with the same procedure for $n_1$
and $n_2$ instead of $n$, etc., up to the moment when $n = p_1 p_2 p_3 \dots p_k$, where
all $p_1, p_2, \dots, p_k$ are prime numbers. Some of them may be equal to one
another, so we can write $n = q_1^{\alpha_1} q_2^{\alpha_2} \dots q_h^{\alpha_h}$, where $q_1, q_2, \dots, q_h$ are
distinct primes.


THEOREM 1. (The Fundamental Theorem of Arithmetic) Any natural
number $n$ greater than 1 is either a prime number or it can be
uniquely written as $n = q_1^{\alpha_1} q_2^{\alpha_2} \dots q_h^{\alpha_h}$, where $q_1, q_2, \dots, q_h$ are distinct
prime numbers.
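The decomposition procedure described before the theorem can be tried out on small numbers. The following Python sketch is an illustration of ours, not the author's method: it uses plain trial division, and the name `prime_factorization` is hypothetical. It returns the distinct primes together with their exponents.

```python
def prime_factorization(n):
    """Return {prime: exponent} for a natural number n > 1 (trial division)."""
    if n < 2:
        raise ValueError("n must be a natural number greater than 1")
    factors = {}
    p = 2
    while p * p <= n:
        # Divide out p as many times as possible; p is then a prime factor.
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # What is left has no divisor up to its square root, so it is prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


print(prime_factorization(360))  # 360 = 2^3 * 3^2 * 5
```

Trial division is only practical for small $n$; it is chosen here because it mirrors the argument "split $n$ until only primes remain".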


All the other basic results in number theory are directly or
indirectly connected with this main result. For instance, Euclid proved
that the set of all prime numbers is infinite. Indeed, if it were not so,
let $q_1, q_2, \dots, q_N$ be all the distinct primes. Then let us consider the
natural number $m = q_1 q_2 \dots q_N + 1$. It is either a prime number or it
is divisible by a prime number $p$ (in the first case take $p = m$). Since
$q_1, q_2, \dots, q_N$ are all the prime
numbers, this $p$ must be equal to a $q_j$ for some $j \in \{1, 2, \dots, N\}$. Then 1 is
divisible by $q_j$, a contradiction (Why?). Thus, our assumption is false,
i.e. the set of prime numbers is infinite. The most delicate hypotheses
and results in Mathematics are connected with this set.
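To see Euclid's argument at work on a concrete list, here is a small Python sketch (our own illustration; the names `smallest_prime_factor` and `euclid_witness` are hypothetical). Given any finite list of primes, it forms their product plus 1 and exhibits a prime divisor of that number, which cannot belong to the list.

```python
from math import prod


def smallest_prime_factor(m):
    """Smallest divisor of m greater than 1 (it is necessarily prime)."""
    p = 2
    while p * p <= m:
        if m % p == 0:
            return p
        p += 1
    return m


def euclid_witness(primes):
    """Return (m, p): m = product of the given primes + 1, p a prime dividing m."""
    m = prod(primes) + 1
    return m, smallest_prime_factor(m)


m, p = euclid_witness([2, 3, 5, 7, 11, 13])
print(m, p, p in [2, 3, 5, 7, 11, 13])
```

Note that $m$ itself need not be prime: for the list above, $m = 30031 = 59 \cdot 509$. The argument only needs some prime divisor of $m$ outside the list.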

Recall that a function $f : X \to Y$, where $X$ and $Y$ are arbitrary
sets, is said to be injective (or one-to-one) if for any pair of distinct
elements $a$ and $b$ from $X$, their images $f(a)$ and $f(b)$ are distinct in $Y$.
$f$ is surjective (or onto $Y$) if any element $y$ of $Y$ is the image of an
element $x$ of $X$, i.e. $y = f(x)$. Injective + surjective means bijective.
If $f$ is bijective we simply say that it is "a bijection" between the sets
$X$ and $Y$, or that they have "the same cardinal". For instance, $\mathbb{N}$ and
$\mathbb{Z}$ have the same cardinal because $f : \mathbb{N} \to \mathbb{Z}$, $f(0) = 0$, $f(2n) = -n$
and $f(2n - 1) = n$, for $n = 1, 2, \dots$, is a bijection (Why?).
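The bijection between $\mathbb{N}$ and $\mathbb{Z}$ given above can be checked numerically on a finite range. The Python sketch below is only an experiment, not a proof; the inverse map `f_inverse` is our own addition and is not stated in the text.

```python
def f(k):
    """The map N -> Z from the text: f(0) = 0, f(2n) = -n, f(2n - 1) = n."""
    if k % 2 == 0:
        return -(k // 2)
    return (k + 1) // 2


def f_inverse(z):
    """Candidate inverse Z -> N: positive z comes from 2z - 1, the others from -2z."""
    return 2 * z - 1 if z > 0 else -2 * z


print([f(k) for k in range(7)])
```

If `f_inverse(f(k)) == k` for every natural $k$ and `f(f_inverse(z)) == z` for every integer $z$, then $f$ is both injective and surjective; the test on a finite range merely suggests how to organize the proof.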

Generally, if a set $A$ has the same cardinal as $\mathbb{N}$ we say that it
is countable. If a set $B$ has the same cardinal as a set of the form
$\{1, 2, \dots, n\}$ we say that it is finite and that it has $n$ elements, or that
its cardinal is $n$. Why can a set $A$ not be finite and countable at the
same time?

Any countable set $A$ can be represented as a sequence: $a_0 = f(0)$,
$a_1 = f(1)$, $a_2 = f(2)$, ..., where $f : \mathbb{N} \to A$ is a bijection between $\mathbb{N}$ and
$A$ (see the definition of countability!). Conversely, any set $A$ which can
be represented as a sequence is countable, i.e. it is the image of the
set $\mathbb{N}$ of natural numbers through a bijection $f$ (prove this!). Hence, we
define "a sequence" in a set $A$ as a function $g : \mathbb{N} \to A$. Usually we
denote $g(n)$ by $a_n$ and write the sequence $g$ as $a_0, a_1, a_2, \dots, a_n, \dots$ or
simply as $\{a_n\}$, where $a_n$ is said to be the general term of the sequence
$g$. Here, for instance, $a_5$ is called the term of rank 5 of the sequence $g$.
A sequence $\{b_m\}$ is called a "subsequence" of the sequence $\{a_n\}$ if there
is a sequence $k_0 < k_1 < \dots < k_m < \dots$ of natural numbers such that for
any $m \in \mathbb{N}$, $b_m$ is equal to $a_{k_m}$. For instance $\{b_k = 2k\}$, $k = 0, 1, 2, \dots$,
is a subsequence of $\mathbb{N} = \{0, 1, 2, \dots\}$. But the sequence $\{0, 1, 2, 2, 2, \dots\}$
is NOT a subsequence of $\mathbb{N}$ (Why?). Yes, the set $\{0, 1, 2\}$ IS a subset
of $\mathbb{N}$, but not ... a subsequence! Can $\mathbb{N}$ be a subsequence of $\mathbb{Z}$?
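The requirement that the indices $k_0 < k_1 < \dots$ be strictly increasing is the whole point of the definition. As a hedged experiment (our own sketch, limited to finitely many terms and to a bounded search, with the hypothetical name `is_subsequence_prefix`), one can test whether the first few terms of a candidate $\{b_m\}$ can be picked out of a sequence $\{a_n\}$ in this way:

```python
def is_subsequence_prefix(b, a, search_limit=1000):
    """Check whether the finite list b equals a(k_0), a(k_1), ... for some
    strictly increasing indices k_0 < k_1 < ... below search_limit.

    `a` is a function n -> a_n. The search is greedy: each term of b is
    matched at the first index still available.
    """
    k = 0
    for term in b:
        while k < search_limit and a(k) != term:
            k += 1
        if k == search_limit:
            return False  # no index left at which this term occurs
        k += 1  # the next index must be strictly larger
    return True


naturals = lambda n: n
print(is_subsequence_prefix([0, 2, 4, 6], naturals))
print(is_subsequence_prefix([0, 1, 2, 2, 2], naturals))
```

The second call fails because the value 2 occurs only once in $\mathbb{N}$, so it cannot be reached again at a strictly larger index.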

Now our question is: "How do we represent 2 kg and a quarter
on the line $(d)$?" More exactly, to the point $C$ on $(d)$ which is the
extremity of a vector $\overrightarrow{OC}$, obtained by taking $\overrightarrow{OA_1}$ twice + a quarter
of the same vector $\overrightarrow{OA_1}$, what kind of sequence of digits 0, 1, 2, ..., 9
could we associate? Let us divide the segment $[OA_1]$ into 10 equal
parts and let us associate the symbol 0.1 to the extremity $A_{\frac{1}{10}}$ of
the vector $\overrightarrow{OA_{\frac{1}{10}}}$, which is the 10-th part of $\overrightarrow{OA_1}$. In the same way
we construct $A_{\frac{2}{10}}, A_{\frac{3}{10}}, \dots, A_{\frac{9}{10}}$ and their corresponding symbols 0.2,
0.3, ..., 0.9. We continue by dividing the segment $[OA_{\frac{1}{10}}]$ into 10 equal
parts and obtain the new symbols 0.01, 0.02, ..., 0.09, etc. We say
that $0.1 = \frac{1}{10}$, $0.01 = \frac{1}{100}$, and so on. For instance, the sequence (or the
number) 23.0145 represents the point $E$ on $(d)$ obtained in the following
way. To the vector $\overrightarrow{OA_{23}}$ we add
$\frac{1}{100}\overrightarrow{OA_1} + \frac{4}{1000}\overrightarrow{OA_1} + \frac{5}{10000}\overrightarrow{OA_1}$. The
resultant vector is $\overrightarrow{OE}$, etc. If one works (by symmetry) on the left of
$O$, one gets the "negative" numbers of the form $-a_n a_{n-1} \dots a_0 . b_1 b_2 \dots b_m$,
where $a_i$ and $b_j$ are digits from the set $\{0, 1, 2, \dots, 9\}$. This last number
can be written as:


$$-a_n a_{n-1} \dots a_0 . b_1 b_2 \dots b_m = -\left(10^n a_n + 10^{n-1} a_{n-1} + \dots + a_0 + \frac{b_1}{10} + \frac{b_2}{10^2} + \dots + \frac{b_m}{10^m}\right). \qquad (1.4)$$

Here appeared fractions like $\frac{a}{b}$, where $a$ and $b$ are natural numbers
and $b \neq 0$. We suppose that the reader is familiar with the operations of
addition, subtraction, multiplication and division with such fractions.
If $a \in \mathbb{Z}$ and $b = 10^m$, from this discussion, we have the geometrical
meaning of the fraction $\frac{a}{b}$. We also call any fraction a number. What
is the geometrical meaning of $\frac{4}{7}$? Take again the vector $\overrightarrow{OA_1}$ and
divide it into 7 equal parts. Let $\overrightarrow{OG}$ be the 7-th part of $\overrightarrow{OA_1}$. Then
$4\,\overrightarrow{OG} = \overrightarrow{OH}$ and $H$ will be the point which corresponds to the number
$\frac{4}{7}$. The Greeks said that the number $\frac{4}{7}$ is obtained when we want to
measure a segment $[ON]$ with another segment $[OM]$ and if we can find
a third segment $[OP]$ such that $[ON] = 4[OP]$ and $[OM] = 7[OP]$, i.e.
$\frac{ON}{OM} = \frac{4}{7}$. A representation of a number (for instance a fraction) as
$\pm a_n a_{n-1} \dots a_0 . b_1 b_2 \dots b_m \dots$ is called a decimal representation (or a decimal
fraction). Let us try to find a decimal representation for the fraction $\frac{4}{7}$.


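The idea is to write $\frac{4}{7}$ as $\frac{1}{10} \cdot \frac{40}{7}$. Then $40 = 5 \cdot 7 + 5$ implies
$\frac{40}{7} = 5 + \frac{5}{7}$, where $\frac{5}{7} < 1$. Hence
$\frac{4}{7} = \frac{5}{10} + \frac{1}{10} \cdot \frac{5}{7}$. Now we do the same for $\frac{5}{7}$. Namely,

$$\frac{5}{7} = \frac{1}{10} \cdot \frac{50}{7} = \frac{1}{10}\left(7 + \frac{1}{7}\right), \qquad
\frac{4}{7} = \frac{5}{10} + \frac{7}{10^2} + \frac{1}{10^2} \cdot \frac{1}{7}.$$
Write now
$$\frac{1}{7} = \frac{1}{10} \cdot \frac{10}{7} = \frac{1}{10}\left(1 + \frac{3}{7}\right).$$
So
$$\frac{4}{7} = \frac{5}{10} + \frac{7}{10^2} + \frac{1}{10^3} \cdot \frac{10}{7}
= \frac{5}{10} + \frac{7}{10^2} + \frac{1}{10^3} + \frac{1}{10^3} \cdot \frac{3}{7}.$$

Since the remainders obtained by dividing natural numbers by 7 can
be 0, 1, 2, 3, 4, 5, or 6, in the sequence $\frac{5}{7}, \frac{1}{7}, \frac{3}{7}, \dots$ at least one of the
fractions must appear again after at most 7 steps. Thus, let us go on!
Write
$$\frac{3}{7} = \frac{1}{10} \cdot \frac{30}{7} = \frac{1}{10}\left(4 + \frac{2}{7}\right).$$
So
$$\frac{4}{7} = \frac{5}{10} + \frac{7}{10^2} + \frac{1}{10^3} + \frac{4}{10^4} + \frac{1}{10^4} \cdot \frac{2}{7}.$$
But
$$\frac{2}{7} = \frac{1}{10} \cdot \frac{20}{7} = \frac{1}{10}\left(2 + \frac{6}{7}\right)
= \frac{1}{10}\left(2 + \frac{1}{10} \cdot \frac{60}{7}\right)
= \frac{1}{10}\left(2 + \frac{1}{10}\left(8 + \frac{4}{7}\right)\right).$$
So
$$\frac{4}{7} = \frac{5}{10} + \frac{7}{10^2} + \frac{1}{10^3} + \frac{4}{10^4} + \frac{2}{10^5} + \frac{8}{10^6} + \frac{1}{10^6} \cdot \frac{4}{7}.$$
But
$$\frac{4}{7} = \frac{1}{10} \cdot \frac{40}{7} = \frac{1}{10}\left(5 + \frac{5}{7}\right).$$
Hence
$$\frac{4}{7} = \frac{5}{10} + \frac{7}{10^2} + \frac{1}{10^3} + \frac{4}{10^4} + \frac{2}{10^5} + \frac{8}{10^6} + \frac{5}{10^7} + \dots \qquad (1.5)$$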


Since the digit 5 appears again, we must have: 


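$$\frac{4}{7} = 0.5714285714285\dots = 0.(571428).$$
We say that $\frac{4}{7}$ is a simple periodical decimal fraction. Here we meet
with an "infinite" sum, i.e. with a series:

$$0.(571428) = \left(\frac{5}{10} + \frac{7}{10^2} + \frac{1}{10^3} + \frac{4}{10^4} + \frac{2}{10^5} + \frac{8}{10^6}\right)
+ \left(\frac{5}{10^7} + \frac{7}{10^8} + \dots + \frac{8}{10^{12}}\right) + \dots$$
$$= \left(\frac{5}{10} + \frac{7}{10^2} + \frac{1}{10^3} + \frac{4}{10^4} + \frac{2}{10^5} + \frac{8}{10^6}\right)
\left(1 + \frac{1}{10^6} + \frac{1}{10^{12}} + \dots\right).$$

But $1 + \frac{1}{10^6} + \frac{1}{10^{12}} + \dots$ is an infinite geometrical progression with the
first term 1 and the ratio $\frac{1}{10^6}$. The actual mathematical meaning of this
infinite sum will be explained later.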
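The step-by-step computation above is ordinary long division, and the remark about repeating remainders is exactly what produces the period. A possible Python sketch follows (ours, not the author's; the name `decimal_expansion` is hypothetical and positive numerator and denominator are assumed). It stops as soon as a remainder occurs for the second time.

```python
from fractions import Fraction


def decimal_expansion(numerator, denominator):
    """Long division of positive integers.

    Returns (integer_part, non_periodic_digits, periodic_digits).
    """
    integer_part, remainder = divmod(numerator, denominator)
    digits = []
    seen = {}  # remainder -> position of the digit it produces
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(10 * remainder, denominator)
        digits.append(digit)
    if remainder == 0:
        return integer_part, digits, []  # finite decimal representation
    start = seen[remainder]
    return integer_part, digits[:start], digits[start:]


print(decimal_expansion(4, 7))

# Assumption of this sketch: the geometric progression 1 + q + q^2 + ...
# has the sum 1 / (1 - q) for q = 1/10^6 (a fact justified later in a
# course of Analysis). With it, 0.(571428) is recovered as an exact fraction.
period_value = Fraction(571428, 10**6)
print(period_value / (1 - Fraction(1, 10**6)))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that no rounding hides the periodic pattern.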

The next question is whether one can always measure a segment $a$ by
another segment $b$ and obtain as a result a fraction $\frac{m}{n}$. The Greeks
already discovered in Antiquity that this operation is not always possible. For
instance, if one wants to measure the diagonal $d$ of a square with the
side $a$ of the same square, we obtain a new number $\frac{d}{a}$ such that
$\left(\frac{d}{a}\right)^2 = 2$
(apply Pythagoras' Theorem). If $\frac{d}{a}$ were a fraction $\frac{m}{n}$, where $m, n \in \mathbb{N}$,
$n \neq 0$ and $m, n$ have no common divisor except 1, then $m^2 = 2n^2$
and 2 would be a divisor of $m$, i.e. $m = 2m'$. Thus, $2m'^2 = n^2$ and
then $n$ would also have 2 as a divisor, a contradiction. Usually such
a number is denoted by $\sqrt{2}$ because its square is 2. Such numbers
were not accepted by the Greeks as being "real" numbers! But $\sqrt{2}$ can
be represented on the real line $(d)$. It is the point $U$ which is the
extremity of a vector $\overrightarrow{OU}$ whose length is equal to the length of
the diagonal of a square of side 1 (= the length of $\overrightarrow{OA_1}$). Any fraction
is called a rational number and any other number (like $\sqrt{2}$) is called
an irrational number. $\sqrt{2}$ is an algebraic number because it is a root
of an equation with rational coefficients ($X^2 - 2 = 0$). We say that
a number is a real number if it is the result of a measurement, i.e. it
can be associated with a point of the real line $(d)$. Up to now we know
that NOT all real numbers can be represented by ordinary fractions
($\sqrt{2}$, for instance, cannot). We shall indicate below a natural way to associate to any
point of the line $(d)$ a decimal fraction, usually infinite. Recall that to
the point $A_n$ ($\overrightarrow{OA_n} = n\,\overrightarrow{OA_1}$) we associated a natural number $n$ (given
as a finite sequence of digits). The symmetric point of $A_n$ relative to
the origin $O$ was denoted by $A_{-n}$ (see Fig. 1.1). Our intuition says that
any point $M$ belongs to a segment of the type $[A_n, A_{n+1})$, where $n$ here
can be positive or nonpositive (i.e. $n \in \mathbb{Z}$). We want to associate to
the point $M$ its coordinate $x_M$, i.e. a decimal number in the interval
$[n, n + 1)$ = the set of all the real numbers (known or unknown up to
now!) which are greater than or equal to $n$ and less than $n + 1$ (relative
to the above lexicographic order). So $\bigcup_{n \in \mathbb{Z}} [A_n, A_{n+1})$ = all the points of
$(d)$. But this last assertion cannot be mathematically proved using only
previous simpler results! It is called Archimedes' Axiom. In the
language of the real numbers it says that any such number $r$ belongs to
an interval of the type $[n, n + 1)$. This $n$ is called the integral part of $r$
and it is denoted by $[r]$. For instance, $[3.445] = 3$, but $[-3.445] = -4$,
because $-3.445 \in [-4, -3)$. So, our point $M$ belongs to an interval
of the type $[A_n, A_{n+1})$ for ONLY one $n = \pm a_k a_{k-1} \dots a_0$, where the $a_i$ are
digits. Let us divide the segment $[A_n, A_{n+1})$ into 10 equal parts by 9
points $B_1, B_2, \dots, B_9$, such that:

$$[A_n, A_{n+1}) = [A_n = B_0, B_1) \cup [B_1, B_2) \cup \dots \cup [B_9, A_{n+1} = B_{10}).$$


To these points we obviously associate the following rational numbers:
$$B_1 \to n + 0.1, \quad B_2 \to n + 0.2, \quad \dots, \quad B_9 \to n + 0.9.$$


Since $M \in [A_n, A_{n+1})$, $M$ belongs to one and only one subsegment
$[B_i, B_{i+1})$, where $i \in \{0, 1, \dots, 9\}$. By definition we take the first
decimal of $x_M$ to be this last digit, $b_1 = i$. If $M$ is just $B_i$ we have
$x_M = \pm a_k a_{k-1} \dots a_0 . b_1$. If $M$ is on the right of $B_i$, the actual $x_M$ will
be greater than the rational number $\pm a_k a_{k-1} \dots a_0 . b_1$ and we continue our
above division process. Namely, instead of $[A_n, A_{n+1})$ we take the interval $[B_i, B_{i+1})$
that $M$ belongs to and divide it into 10 equal parts by
the points $C_0 = B_i, C_1, \dots, C_9$ and $C_{10} = B_{i+1}$. There is only one $j$ such
that $M \in [C_j, C_{j+1})$. By definition, the second decimal of $x_M$ is $b_2 = j$.
If $M = C_j$, then $x_M = \pm a_k a_{k-1} \dots a_0 . b_1 b_2$ and $x_M$ is a rational
number. If NOT, then we go on with the segment $[C_j, C_{j+1})$ instead of
$[B_i, B_{i+1})$, etc. If at some step $M$ is the left edge of an interval
obtained as above, then $x_M$ has a finite decimal representation,
i.e. it is a rational number. If $M$ is never in this situation,
then $x_M$ may or may not be a rational number. For instance, the point
$P$ which corresponds to the fraction $\frac{4}{7}$ is in this last position but it
is represented by a fraction, so $x_P$ is a rational number. The point $V$
which corresponds to $\sqrt{2}$ is in the same position as $P$, but $x_V$ is not a
rational number, as we proved above. The segments constructed above
are contained one in another:

$$[A_n, A_{n+1}) \supset [B_i, B_{i+1}) \supset [C_j, C_{j+1}) \supset \dots$$
If $M$ is not the left edge of any of these segments, then their
intersection is exactly $M$ (Why?).
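The subdivision procedure just described is in fact an algorithm for producing decimals one at a time. Below is a hedged Python sketch of it (our own; the names `digits_by_subdivision`, `at_or_left_of_m` and `at_or_left_of_sqrt2` are hypothetical). The point $M$ is known to the program only through a test telling whether a rational division point lies at or to the left of $M$; exact fractions are used for the division points.

```python
from fractions import Fraction
from math import floor


def digits_by_subdivision(at_or_left_of_m, integer_part, count):
    """First `count` decimals of the coordinate of a point M in
    [integer_part, integer_part + 1).

    at_or_left_of_m(q) must return True exactly when the rational
    point q is at M or on the left of M.
    """
    left = Fraction(integer_part)  # left edge of the current interval
    width = Fraction(1)
    digits = []
    for _ in range(count):
        width /= 10  # divide the current interval into 10 equal parts
        # M lies in [left + i*width, left + (i+1)*width) for the largest i
        # whose division point is not on the right of M.
        i = max(j for j in range(10) if at_or_left_of_m(left + j * width))
        digits.append(i)
        left += i * width
    return digits


def at_or_left_of_sqrt2(q):
    """q <= sqrt(2), decided with rational arithmetic only (q*q <= 2)."""
    return q < 0 or q * q <= 2


print(digits_by_subdivision(at_or_left_of_sqrt2, 1, 8))
print(digits_by_subdivision(lambda q: q <= Fraction(4, 7), 0, 6))

# The integral part [r] from the text corresponds to math.floor:
print(floor(Fraction("3.445")), floor(Fraction("-3.445")))
```

For $\sqrt{2}$ the process never stops, in agreement with the text; the program simply cuts it off after `count` steps.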


In general, the following question arises. If one has a tower of closed
segments

$$[T_1, U_1] \supset [T_2, U_2] \supset \dots \supset [T_n, U_n] \supset \dots$$
on the real line $(d)$, is their intersection empty or not? Our intuition
says that it cannot be empty! But there is no mathematical
proof for this! This is why this last assertion is an axiom, called
Cantor's Axiom. Now we can call a real number $r$ any decimal
fraction (finite or not) of the type:


$$r = \pm a_k a_{k-1} \dots a_0 . b_1 b_2 \dots b_n \dots \qquad (1.6)$$
We can write this "number" as a sum of some special type of fractions:
$$r = \pm\left(10^k a_k + \dots + 10 a_1 + a_0 + \frac{b_1}{10} + \frac{b_2}{10^2} + \dots + \frac{b_n}{10^n} + \dots\right). \qquad (1.7)$$

Using this last representation, it is not difficult to define the usual
elementary operations of addition, subtraction, multiplication, and
division for the set $\mathbb{R}$ of all the real numbers (do it and find a natural
explanation for the rules you learned in high school! You must also
use the fact that $r = \lim_{m \to \infty} r_m$, where

$$r_m = \pm\left(10^k a_k + \dots + 10 a_1 + a_0 + \frac{b_1}{10} + \frac{b_2}{10^2} + \dots + \frac{b_m}{10^m}\right),$$

and the usual operations with convergent sequences). The algebraists
say that $\mathbb{R}$ together with addition and multiplication is a field (see
the exact definition of a field in any Algebra course and verify this last
assertion!). Because the real numbers are nothing else
than a representation of the points of the real line (together with a
Cartesian reference frame on it!), Archimedes' and Cantor's
axioms work on $\mathbb{R}$. They can be expressed in the following way (in
the language of numbers):


AXIOM 1. (Archimedes' Axiom) For any real number $r$ there is
one and only one integer $n$ such that $n \leq r < n + 1$.


AXIOM 2. (Cantor's Axiom) Let $a_1 \leq a_2 \leq \dots \leq a_n \leq \dots$ and
$b_1 \geq b_2 \geq \dots \geq b_n \geq \dots$ be two sequences of real numbers such that for
any $n$ one has $a_n \leq b_n$. Then there is at least one real number $r$
between $a_n$ and $b_n$ for any $n \in \mathbb{N}$. If, in addition, the difference $b_n - a_n$
tends to zero as $n$ becomes larger and
larger, then this real number $r$ is unique (in fact, this last assertion is
not an axiom!).


Hence, the real numbers can always be seen as points on a real
line $(d)$. If we change the line and (or) the Cartesian reference frame we
clearly obtain different sets of real numbers. But all these fields of real
numbers are isomorphic as ordered fields. This means that for any
two such fields $\mathbb{R}_1$ and $\mathbb{R}_2$ there is at least one bijection $f : \mathbb{R}_1 \to \mathbb{R}_2$
such that $f(x + y) = f(x) + f(y)$, $f(xy) = f(x)f(y)$ ($f$ preserves
the algebraic structure of fields) and $f(x) < f(y)$ whenever $x < y$
($f$ preserves the order introduced above). Here $x, y \in \mathbb{R}_1$. In fact, it
is not difficult to construct such a bijection. If we take $x \in \mathbb{R}_1$, it
is the decimal representation of a point $X$ on the first real line $(d_1)$.
But one can always construct a natural bijection $g$ between the points
of $(d_1)$ and the points of $(d_2)$ which carries the Cartesian coordinate
system of the first line into the coordinate system of the second line.
Now we take for $f(x)$ the real number which corresponds to the point
$g(X)$ of the second line (prove that this construction works).

From now on we fix a field $\mathbb{R}$ of real numbers and we assume that
the reader knows the usual elementary rules of operating in this $\mathbb{R}$. It is
of great benefit if one always thinks of a real number as being a point
on a fixed real line $(d)$. So, ... draw everything or almost everything!
This is why we say a point instead of a number and a number instead
of a point!


We realize that the "practical" representation of an irrational number
on the real line $(d)$ is impossible! This means that you will never
find a finite algorithm to do this, because the point on $(d)$ which
corresponds to such an irrational number is obtained as the intersection
of an infinite number of closed intervals, each of them contained in
the previous one. Since the length of these intervals becomes smaller and
smaller, tending to zero, in practice we can approximate the real position of
that point by one of the two ends of such a "very small" interval.

We must remark that the correspondence between the points of the
real line $(d)$ and the decimal representations is not a bijection. For
instance, $0.999\dots = 1$. But the correspondence between the points
of the real line $(d)$ and the real numbers is a bijection! (Descartes'
bijection).
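The equality $0.999\dots = 1$ can be made plausible by looking at the finite truncations. The short Python sketch below (ours; the name `nines` is hypothetical) computes $0.9, 0.99, 0.999, \dots$ as exact fractions and shows that the distance to 1 after $n$ nines is exactly $\frac{1}{10^n}$, which becomes smaller than any given positive number.

```python
from fractions import Fraction


def nines(n):
    """The finite decimal 0.99...9 with n nines, as an exact fraction."""
    return sum(Fraction(9, 10**k) for k in range(1, n + 1))


for n in (1, 2, 5):
    print(n, nines(n), 1 - nines(n))
```

So the two decimal representations $0.999\dots$ and $1.000\dots$ describe the same point of $(d)$: both lie in every interval $[\,1 - \frac{1}{10^n},\, 1\,]$.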


Let us come back and recall that the set of natural numbers

$$\mathbb{N} = \{0, 1, \dots, 9, 10, 11, \dots, 20, 21, \dots, n, \dots\}$$
can be naturally embedded in the ring of integers
$$\mathbb{Z} = \{0, 1, -1, 2, -2, \dots, n, -n, \dots\},$$
where $n$ is a natural number. This embedding preserves the usual
operations of addition and multiplication. Both sets $\mathbb{N}$ and $\mathbb{Z}$ are clearly
countable because they are naturally represented as sequences. What
is the difference between $\mathbb{N}$ and $\mathbb{Z}$? The equation $X - 3 = 0$ has a
solution in $\mathbb{N}$, $x = 3$, whereas the equation $X + 3 = 0$ has NO solution
in $\mathbb{N}$, but it has the solution $x = -3$ in $\mathbb{Z}$. The next step is to see that
the general linear equation of the form $aX + b = 0$, where $a, b \in \mathbb{Z}$,
may have no solution in $\mathbb{Z}$. For instance, $2X + 1 = 0$ has no solution in
$\mathbb{Z}$, but its solution is the fraction $\frac{-1}{2} = -\frac{1}{2}$, which is a rational number.
Let us denote by $\mathbb{Q}$ the field of rational numbers and see that any
integer $m$ can be represented as a rational number: $m = \frac{m}{1}$.
So, $\mathbb{N} \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R}$, since any rational number is a particular real
number by the definition of a real number.


THEOREM 2. The rational number field Q is also a countable set. 


PROOF. It will be enough to represent the positive elements of Q as a subsequence of a sequence (Why? Use the same trick as in the case of the countability of Z). Look now carefully at the following infinite table

    1/1  1/2  1/3  1/4  1/5  1/6  1/7  1/8  ...
    2/1  2/2  2/3  2/4  2/5  2/6  2/7  2/8  ...
    3/1  3/2  3/3  3/4  3/5  3/6  3/7  3/8  ...
    4/1  4/2  4/3  4/4  4/5  4/6  4/7  4/8  ...
    5/1  5/2  5/3  5/4  5/5  5/6  5/7  5/8  ...
    6/1  6/2  6/3  6/4  6/5  6/6  6/7  6/8  ...
    7/1  7/2  7/3  7/4  7/5  7/6  7/7  7/8  ...
    8/1  8/2  8/3  8/4  8/5  8/6  8/7  8/8  ...
    ...

[The arrows of the original figure, which pass from entry to entry along the diagonals of the table, are not reproduced.]


together with the arrows which indicate "the next term" in the sequence. This sequence covers ALL the entries of this table and any positive rational number is an element of this sequence, i.e. the set $Q_+$ of positive rational numbers can be viewed as a subsequence of this last sequence. Thus $Q_+$ is countable. Since $Q = Q_- \cup \{0\} \cup Q_+$, Q is also countable.
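The diagonal walk through the table can be tried on a computer. The following Python sketch is only an illustration, not the construction of the proof itself: the names `table_entries` and `positive_rationals` are made up for this example, and the walk visits every diagonal in the same direction (the arrows of the table may alternate), which does not change the conclusion.

```python
from fractions import Fraction
from itertools import islice

def table_entries():
    """Yield the table entries p/q as pairs (p, q), one diagonal p + q = s at a time."""
    s = 2
    while True:
        for p in range(1, s):
            yield p, s - p
        s += 1

def positive_rationals():
    """Yield every positive rational exactly once, skipping repeated values such as 2/2."""
    seen = set()
    for p, q in table_entries():
        value = Fraction(p, q)
        if value not in seen:
            seen.add(value)
            yield value

first_terms = list(islice(positive_rationals(), 10))
print(first_terms)
```

Every entry p/q is reached after finitely many steps (it lies on the diagonal p + q = s), which is exactly what countability asks for.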


Recall that a real number r is a "disjoint union" of two sequences of digits with + or - in front of it:

(1.8)    $r = \pm a_k a_{k-1} \dots a_0 . b_1 b_2 \dots b_n \dots$

The first sequence is always finite: $a_k, a_{k-1}, \dots, a_0$. After its last digit $a_0$ (the units digit) we put a point ".". Then we continue with the digits of the second sequence: $b_1, b_2, \dots, b_n, \dots$. As we saw above, this last sequence can be infinite. If this last sequence is finite, i.e. if from a moment on $b_{n+1} = b_{n+2} = \dots = 0$, we say that r is a simple rational number. Any simple rational number is a fraction of the form $\frac{a}{10^n}$, where $a \in Z$ and $n \in N$. If r is not a simple rational number, it can be canonically approximated by the simple rational numbers

$r_n = \pm a_k a_{k-1} \dots a_0 . b_1 b_2 \dots b_n$,

for n = 1, 2, .... This means that when n becomes larger and larger, the absolute value

(1.9)    $error_n = |r - r_n| = 0.\underbrace{00 \dots 0}_{n\ \text{times}} b_{n+1} b_{n+2} \dots = \frac{1}{10^n}\left(\frac{b_{n+1}}{10} + \frac{b_{n+2}}{10^2} + \frac{b_{n+3}}{10^3} + \dots\right)$

becomes closer and closer to 0. Indeed,

$\frac{1}{10^n}\left(\frac{b_{n+1}}{10} + \frac{b_{n+2}}{10^2} + \frac{b_{n+3}}{10^3} + \dots\right) \le \frac{1}{10^n}\left(\frac{9}{10} + \frac{9}{10^2} + \frac{9}{10^3} + \dots\right) = \frac{1}{10^n}$

and, since $\frac{1}{10^n} < \frac{1}{n}$ (prove it!), one gets that $|r - r_n| \to 0$ (tends to 0), when $n \to \infty$ (the values of n become larger and larger).
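The bound $|r - r_n| \le \frac{1}{10^n}$ can be checked numerically. The sketch below is a hedged illustration for one nonnegative rational r chosen for the example (r = 1/7); the helper name `truncate` is hypothetical, and exact fractions are used so that no rounding interferes.

```python
from fractions import Fraction

def truncate(r, n):
    """Return r_n: the nonnegative rational r cut after its n-th decimal digit."""
    scale = 10 ** n
    # Integer division drops every digit after the n-th decimal.
    return Fraction(r.numerator * scale // r.denominator, scale)

r = Fraction(1, 7)  # 0.142857142857...
errors = [r - truncate(r, n) for n in range(1, 8)]
for n, err in enumerate(errors, start=1):
    print(n, truncate(r, n), float(err))
```

Each printed error is nonnegative and does not exceed $10^{-n}$, in agreement with (1.9).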


REMARK 1. Hence, in any interval (a, b), $a \neq b$, a, b real numbers, one can find infinitely many simple rational numbers (prove it!).


But what is the mathematical model for the fact that a sequence $\{x_n\}$, n = 0, 1, ... tends to 0 (i.e. $|x_n|$ becomes closer and closer to 0, when n becomes larger and larger ($n \to \infty$))?


DEFINITION 1. We say that a sequence $\{x_n\}$, n = 0, 1, ... is convergent to 0 (or tends to 0), when n tends to $\infty$ ($n \to \infty$), if for any positive (small) real number $\varepsilon > 0$, there is a natural number $N_\varepsilon$ (depending on $\varepsilon$) such that $|x_n| < \varepsilon$ for any $n \ge N_\varepsilon$. We simply write this: $x_n \to 0$, or, more formally: $\lim_{n \to \infty} x_n = 0$, or, less formally: $\lim x_n = 0$. We also say that a sequence $\{x_n\}$, n = 1, 2, ... is convergent to a real number x (or that x is the limit of $\{x_n\}$; write $\lim x_n = x$) if the difference sequence $\{x_n - x\}$, n = 1, 2, ... is convergent to 0, or, if the "distance" $|x_n - x|$ between $x_n$ and x becomes smaller and smaller as $n \to \infty$. This is equivalent to saying that for any positive (small) real number $\varepsilon$, all the terms of the sequence $\{x_n\}$, n = 0, 1, ..., except a finite number of them, belong to the open interval $(x - \varepsilon, x + \varepsilon)$. Such an interval, centered at x and of "radius $\varepsilon$", is called an $\varepsilon$-neighborhood of x.
THEOREM 3. Let $\{x_n\}$ be a convergent sequence. Then its limit is a unique real number.


PROOF. Let us assume that x and x' are two distinct limits of the sequence $\{x_n\}$ and let $\varepsilon$ be a positive small real number such that $\varepsilon < |x - x'|$. Since both x and x' are limits of the sequence $\{x_n\}$, for n large enough, one must have $|x_n - x| < \frac{\varepsilon}{2}$ and $|x' - x_n| < \frac{\varepsilon}{2}$. Now

$\varepsilon < |x' - x| = |x' - x_n + x_n - x| \le |x' - x_n| + |x_n - x| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$,

or $\varepsilon < \varepsilon$, a contradiction! So, any two limits of the sequence $\{x_n\}$ must be equal!


In (1.9) we have in fact that any real number r can be approximated by its simple rational number components (or approximates) $r_n$, i.e. $\lim r_n = r$. We say that the set of simple rational numbers is dense in R. In particular, Q is dense in R. Let $m \ge 2$ be a fixed natural number and let $Q_m$ be the set of fractions of the form $\frac{a}{m^n}$, where a runs in Z and n runs in N. Then any real number r is a limit of elements from $Q_m$, i.e. $Q_m$ is dense in R (prove it! Write r in the basis m, instead of 10).

We just used above that the sequence $\{\frac{1}{n}\}$, n = 1, 2, ... is convergent to 0. Our intuition says that if we divide the unity vector $OA_1$ (see Fig.1.1) into n equal parts, the length $\frac{1}{n}$ of one of them becomes smaller and smaller. But... why? What is the mathematical explanation for this?

THEOREM 4. The sequence $\{\frac{1}{n}\}$ is convergent to 0.

PROOF. We apply Definition 1. Let $\varepsilon > 0$ be a small positive real number and, by using Archimedes' Axiom, let $N_\varepsilon$ be the unique natural number such that $\frac{1}{\varepsilon} \in [N_\varepsilon - 1, N_\varepsilon)$. So, for any $n \ge N_\varepsilon$, one has that $\frac{1}{\varepsilon} < N_\varepsilon \le n$, i.e. $\frac{1}{n} < \varepsilon$.
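The rank used in this proof can be computed explicitly. The following Python sketch assumes a sample tolerance $\varepsilon = 3/1000$ chosen only for illustration; the function name `archimedes_rank` is hypothetical.

```python
from fractions import Fraction
import math

def archimedes_rank(eps):
    """Return the natural number N with 1/eps in the interval [N - 1, N)."""
    return math.floor(1 / eps) + 1

eps = Fraction(3, 1000)
N = archimedes_rank(eps)
print(N)
# From the rank N on, every term 1/n stays below eps.
print(all(Fraction(1, n) < eps for n in range(N, N + 1000)))
```

For this particular tolerance the term just before the rank, 1/(N - 1), is still not smaller than $\varepsilon$.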


REMARK 2. The absolute value or the modulus |r| of the real number r from (1.8) is simply

$a_k a_{k-1} \dots a_0 . b_1 b_2 \dots b_n \dots$,

i.e. r without minus if it has one. For instance, |-3.14| = 3.14 = |3.14|. Since the function dist, which associates to any pair of real numbers (x, y) the nonnegative real number |x - y|, i.e. dist(x, y) = |x - y|, has the following basic properties (prove them!):

i) dist(x, y) = 0, if and only if x = y,

ii) dist(x, y) = dist(y, x),

iii) $dist(x, y) \le dist(x, z) + dist(z, y)$ (the triangle inequality),

for any x, y, z in R, we say that dist(x, y) = |x - y| is the distance between x and y and that R together with this distance function dist is a metric space.

Another example of a metric space is the Cartesian plane xOy with the distance function between two points $M_1(x_1, y_1)$ and $M_2(x_2, y_2)$ given by the formula:

$dist(M_1, M_2) = M_1M_2 = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$,

i.e. the length of the segment $[M_1M_2]$. Here we can see why the property iii) was called "the triangle inequality" (convince yourself of this by drawing a triangle in the plane!).


Now, what is the difference between the rational number field Q and 
the real number field R? The first one is that Q is countable and, as 
the following result says, R is not countable, so the subset of irrational 
numbers is "greater" than the subset of rational numbers. 


THEOREM 5. (Cantor’s Theorem). The set R is not countable, 
i.e. one can NEVER represent the whole set of the real numbers as a 
sequence. 


PROOF. Let r be like in (1.8). It is enough to prove that the set S of all the sequences $\{b_1, b_2, \dots, b_n, \dots\}$, where $b_n$ is a digit, is not countable. Suppose on the contrary, namely that S can be represented like a sequence of ... sequences: $S = \{B_1, B_2, \dots, B_n, \dots\}$, where

$B_n = \{b_{n1}, b_{n2}, b_{n3}, \dots, b_{nn}, \dots\}$

and $b_{ij}$ are digits. In order to obtain a contradiction, it is enough to construct a new sequence of digits, which is distinct from any $B_i$ for i = 1, 2, .... Let $C = \{c_1, c_2, \dots, c_n, \dots\}$ with the following property: $c_n = b_{nn} + 1$, if $b_{nn} \neq 9$ and $c_n = 0$, if $b_{nn} = 9$. Now, let us see that C is not in the list $B_1, B_2, \dots$. Assume that $C = B_k$ for a $k \in \{1, 2, \dots\}$. By the definition of $c_k$, this last one cannot be equal to $b_{kk}$, thus the k-th term of C is not equal to the k-th term of $B_k$ and so, $C \neq B_k$, a contradiction! Hence C, which is a sequence of digits, does not appear in the list. So S cannot be represented like a sequence.
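The diagonal construction can be tried on a finite block of digits. In the Python sketch below the four rows are arbitrary sample data and `diagonal_sequence` is a hypothetical helper name; only the first four digits of four sequences are used, whereas the proof works with infinitely many.

```python
def diagonal_sequence(rows):
    """Return c with c[n] = rows[n][n] + 1, or 0 when rows[n][n] equals 9."""
    return [0 if row[n] == 9 else row[n] + 1 for n, row in enumerate(rows)]

rows = [
    [1, 4, 1, 5],
    [9, 9, 9, 9],
    [0, 0, 0, 0],
    [2, 7, 1, 8],
]
c = diagonal_sequence(rows)
print(c)
# c differs from the k-th row in the k-th position, so it equals no row.
print(all(c[k] != rows[k][k] for k in range(len(rows))))
```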


It is not difficult to prove that the subset of R which consists of all the algebraic elements over Q (roots of polynomials with coefficients in Q) is countable. So, R contains an uncountable subset of transcendental numbers (numbers which are not algebraic). In fact we know very few of them: e, $\pi$, $e^{\sqrt{2}}$, etc. A real number which is not rational is called an irrational number. Since any interval (a, b) is in a one-to-one correspondence with the interval (0, 1) ($f : (0, 1) \to (a, b)$, $f(t) = a + (b - a)t$ is a bijection between (0, 1) and (a, b)) and since $\tan : (-\frac{\pi}{2}, \frac{\pi}{2}) \to R$ is a bijection between $(-\frac{\pi}{2}, \frac{\pi}{2})$ and R, there is a bijection between R and any nontrivial interval (a, b), no matter how small this interval is.


REMARK 3. Hence, (a, b) with $a \neq b$ is not countable. Thus, in (a, b) one can find an infinite number of irrational numbers and even an infinite number of transcendental numbers (why? Explain step by step!).


Can we solve any equation in R? The answer is no! Even the simple equation $X^2 + 1 = 0$, with the coefficients in Z, has no real solution. Why? Because x = 0 is not a solution and, if $x \neq 0$, then $x^2$ is positive (see the multiplication rule of signs!). So, $x^2 + 1$ is greater than 1, thus it cannot be zero. In order to solve this last equation we need to enlarge R up to another field C, the complex number field. Its algebraic structure is the following. Take the 2-dimensional real vector space $V = R \times R$ with the componentwise addition and the componentwise scalar multiplication. Then we introduce a "strange" multiplication:

(1.10)    $(a, b)(c, d) = (ac - bd, ad + bc)$.

It is not difficult to prove that V together with this multiplication becomes a field in which $(0, 1)^2 = (-1, 0)$, identified with the real number -1, because $a \to (a, 0)$ is a canonical embedding of R into V. This new field is usually denoted by C. It is clear that $\pm(0, 1)$ are the solutions of the equation $X^2 + 1 = 0$. What is amazing is that C. F. Gauss proved that any polynomial with coefficients in C has all its roots in C. The algebraists say that C is algebraically closed (it cannot be enlarged by adding to it new roots of polynomials with coefficients in it). Later, Frobenius proved that there is no other superfield of R, which has a finite dimension over it, but C (which has dimension 2 over R). Here dimension means the dimension of C as a vector space over R. Since any z can be written as z = a + ib, where i = (0, 1) and a, b are unique real numbers, {(1, 0), (0, 1)} is a basis in C. So the dimension of C over R is 2.
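Rule (1.10) is easy to experiment with. The short Python sketch below represents elements of V as plain pairs; the function name `mul` is an assumption made for this illustration, and the last line compares the result with Python's built-in complex numbers.

```python
def mul(z, w):
    """Multiply the pairs z = (a, b) and w = (c, d) by rule (1.10)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

i = (0, 1)
print(mul(i, i))              # the pair identified with the real number -1
print(mul((1, 2), (3, 4)))
print((1 + 2j) * (3 + 4j))    # the same product with built-in complex numbers
```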

Let us now come back to our problem relative to the differences between Q and R. Since Q is a subfield of R, the Archimedes Axiom also works on Q. But what about Cantor's Axiom? We know that $\sqrt{2}$ is not in Q. Let us consider the (infinite) decimal representation of $\sqrt{2}$:

(1.11)    $\sqrt{2} = 1.41 b_3 b_4 \dots b_n \dots$

and let us denote by $x_n = 1.41 b_3 b_4 \dots b_n$ the corresponding n-th simple rational number of $\sqrt{2}$. It is clear that the sequence $\{x_n\}$ is an increasing sequence which converges to $\sqrt{2}$. Let us also consider the following decreasing sequence $\{y_n\}$ of simple rational numbers, convergent to the same $\sqrt{2}$: $y_1 = 1.5$, $y_2 = 1.42$, ..., $y_n = x_n + \frac{1}{10^n}$, ..., i.e. $y_n$ is obtained from $x_n$ by adding one unit to its last decimal $b_n$ (with a carry to the previous digits if $b_n = 9$). It is easy to see that the intersection of all the closed intervals $[x_n, y_n]$, n = 1, 2, ..., in Q, is empty in Q (since the intersection in R is exactly $\{\sqrt{2}\}$, and $\sqrt{2}$ is not in Q). Hence the Cantor Axiom does not work for the ordered field Q.
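The nested rational intervals around $\sqrt{2}$ can be produced with integer arithmetic only. This Python sketch is an illustration under one assumption: the upper end $y_n$ is taken to be $x_n$ plus one unit of the n-th decimal place, which gives $y_1 = 1.5$ and $y_2 = 1.42$. The function names `x` and `y` are chosen just for this example.

```python
from fractions import Fraction
from math import isqrt

def x(n):
    """Truncation of sqrt(2) after n decimals, computed with integers only."""
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)

def y(n):
    """Upper end of the n-th interval: x(n) plus one unit of the n-th decimal."""
    return x(n) + Fraction(1, 10 ** n)

for n in range(1, 6):
    # The last column checks that sqrt(2) lies strictly between x(n) and y(n).
    print(n, x(n), y(n), x(n) ** 2 < 2 < y(n) ** 2)
```

All the ends are rational, the lengths $y_n - x_n = 10^{-n}$ tend to 0, and no rational number can lie in every interval, because the square of such a number would have to be 2.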

In this last counterexample we needed some tricks, so it would be desirable to have an equivalent statement to Cantor's Axiom. For this we introduce two important new notions, namely the notion of the least upper bound (LUB) and the notion of the greatest lower bound (GLB) of a given subset of R. We do everything for the LUB and we leave it to the reader to translate all of these to the case of the GLB.

Let A be a nonempty subset of R. A real number z is called an upper bound for A if any element a of A is less than or equal to z. A least upper bound (LUB) for A is (if it does exist!) the least possible z which is an upper bound for A. For instance, the LUB of A = [0, 7) is 7 and the GLB of A is 0. We cannot have two distinct LUB for the same subset A (Why?). If A is (upper) unbounded (i.e. if for any natural number n there is at least one element b of A such that b > n), then A has no upper bound in R and as a logical consequence it has no LUB in R. For instance, $A = [0, \infty)$ has no upper bound in R, but 0 is the GLB of A. R and Z have neither an LUB nor a GLB in R.

Usually, the LUB of a subset A is denoted by sup A (the supremum 
of A) and the GLB of a subset B is denoted by inf B (infimum of B). 


THEOREM 6. (LUB test) Let A be a subset of R. Then c is the LUB of A if and only if for any small positive real number $\varepsilon > 0$, there are an element a of A such that $c - \varepsilon < a \le c$ and an upper bound z of A with $c \le z < c + \varepsilon$. This is equivalent to saying that any $\varepsilon$-neighborhood of c must simultaneously contain an element a of A and an upper bound z of A (Why?).


PROOF. Let us suppose that c = sup A. Assume that we found an $\varepsilon > 0$ such that all the elements of A are less than or equal to $c - \varepsilon$. So $c - \varepsilon$ is an upper bound of A less than c, a contradiction, because, by definition, c is the least upper bound of A. Hence, there is at least one $a \in A$ in the interval $(c - \varepsilon, c]$. If all the upper bounds of A were greater than or equal to $c + \varepsilon$, then c would not be the least upper bound of A and we would obtain again a contradiction.

Conversely, let us assume that c is a real number with the property described in the statement of the above theorem. If c were not sup A, we would have two options: 1) c is not an upper bound of A, i.e. there is at least one a in A greater than c. Taking now $\varepsilon = a - c$ and using our hypothesis for this particular $\varepsilon > 0$, we get an upper bound z of A in the interval $[c, c + \varepsilon) = [c, a)$, i.e. z is less than a. This is in contradiction with the fact that z is an upper bound of A. Hence 1) cannot appear. There remains only the second option: 2) c is an upper bound of A, but it is not the least, namely there is another upper bound y which is less than c. Take now $\varepsilon = c - y > 0$ and use again the hypothesis of the theorem for this new $\varepsilon$. So, one can find an element b of A in the interval $(c - \varepsilon, c] = (y, c]$. Thus, b is greater than y, which was considered to be an upper bound of A. Again a contradiction! Therefore, the second option is also impossible and the proof is complete.


The LUB test is very useful because it supplies us with some important results.


THEOREM 7. The following statements are logically equivalent: i) The Cantor Axiom (see Axiom 2) works in R, ii) Any upper bounded subset A of R has a LUB in R and, iii) Any lower bounded subset B of R has a GLB in R.


PROOF. First of all let us see that ii) and iii) are equivalent. Let us prove for instance that ii) $\Rightarrow$ iii). For the lower bounded subset B of R let us put $-B = \{x \in R : -x \in B\}$, the symmetric subset of B with respect to the origin O (on the real line (d)). It is not difficult to see that the new subset -B is upper bounded in R and so, from ii), it has a LUB b in R. We leave the reader (possibly using Theorem 6) to prove that -b is the GLB of B in R.

We leave as an exercise for the reader to prove that iii) $\Rightarrow$ ii).

Now we prove that i) $\Rightarrow$ ii). Let $b_0$ be an upper bound of A and let $a_0$ be an element of A. It is clear that $a_0 \le b_0$. If $a_0 = b_0$ we have nothing more to prove because the LUB of A will be this common value $c = a_0 = b_0$. Assume that $a_0$ is less than $b_0$ and let us divide the closed interval $[a_0, b_0]$ into two equal closed subintervals by the mid point $c_0$. By the "essential choice" we mean to choose the subinterval $[a_0, c_0]$ if $c_0$ is an upper bound for A, or to choose the subinterval $[c_0, b_0]$ if there is at least one element $a \in A$ in the second subinterval, $[c_0, b_0]$. After we have performed "the essential choice", let us denote by $[a_1, b_1]$ either the subinterval $[a_0, c_0]$ in the first choice, or the subinterval $[a, b_0]$ of $[c_0, b_0]$ in the case of the second choice, where a is the element of A found there. In both situations $a_1 \in A$, $b_1$ is an upper bound of A and $a_0 \le a_1 \le b_1 \le b_0$. Now we take the interval $[a_1, b_1]$, divide it into two equal parts and repeat the "essential choice" for this new interval $[a_1, b_1]$, find $a_2 \in A$ and $b_2$ an upper bound of A with

$a_0 \le a_1 \le a_2 \le b_2 \le b_1 \le b_0$

and so on. We obtain two sequences: an increasing one and a decreasing one in the following position:

$a_0 \le a_1 \le \dots \le a_n \le \dots \le b_n \le \dots \le b_1 \le b_0$

such that the distance $dist(a_n, b_n) \le \frac{dist(a_0, b_0)}{2^n}$. In particular,

$dist(a_n, b_n) \to 0$,

whenever $n \to \infty$. Now we can apply the Cantor Axiom and find a unique point c belonging to all the intervals $[a_n, b_n]$ for any n = 1, 2, ..., i.e. $\lim a_n = \lim b_n = c$ (Why?). We prove now that this c is exactly sup A. Let us now apply the LUB test (see Theorem 6). Take an $\varepsilon > 0$ and let us consider the $\varepsilon$-neighborhood $(c - \varepsilon, c + \varepsilon)$. Since $\lim a_n = \lim b_n = c$, there is an $n \in \{1, 2, \dots\}$ such that $[a_n, b_n] \subset (c - \varepsilon, c + \varepsilon)$. But, by the above construction, $a_n \in A$ and $b_n$ is an upper bound of A. So, by the criterion of Theorem 6, we get that c = sup A.

ii) $\Rightarrow$ i). Let $\{a_n\}$ and $\{b_n\}$ be two sequences of real numbers such that

$a_0 \le a_1 \le \dots \le a_n \le \dots \le b_n \le \dots \le b_1 \le b_0$.

The subset $A = \{a_0, a_1, \dots, a_n, \dots\}$ is upper bounded in R by any term of the second sequence $\{b_n\}$. From ii) we have that A has a LUB c = sup A and $c \le b_n$ for any n = 0, 1, .... Since c is in particular an upper bound of A, one also has that $a_n \le c \le b_n$ for any n = 0, 1, .... Hence the Cantor Axiom works on R.
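The halving procedure from the proof of i) $\Rightarrow$ ii) can be run for a concrete set. The Python sketch below is a hedged illustration for the sample set $A = \{x : x^2 < 2\}$, which is an assumption of this example and not part of the text; the names `is_upper_bound` and `bisect_sup` are hypothetical. For this particular A, a midpoint that is not an upper bound belongs to A itself, so it can serve as the new left end.

```python
from fractions import Fraction

def is_upper_bound(c):
    """True when c bounds A = {x : x*x < 2} from above (assumes c >= 0)."""
    return c * c >= 2

def bisect_sup(a, b, steps):
    """Halve [a, b] repeatedly, keeping a in A and b an upper bound of A."""
    for _ in range(steps):
        c = (a + b) / 2
        if is_upper_bound(c):
            b = c   # first choice: keep the left half [a, c]
        else:
            a = c   # second choice: here c itself is an element of A
    return a, b

a, b = bisect_sup(Fraction(1), Fraction(2), 20)
print(float(a), float(b), b - a)
```

After 20 steps the two ends enclose sup A in an interval of length $2^{-20}$.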


A sequence is said to be monotonous if it is either an increasing or 
a decreasing sequence. For instance, x, = and yn, = — are 
monotonous sequences. 


ae pelle 
n2+1 n241 

REMARK 4. Let us now introduce two symbols: 1) $\infty$, which is considered to be greater than any real number r, with $r + \infty = \infty$, $\infty + \infty = \infty$, and 2) $-\infty$, which is considered to be less than any real number r, with $r + (-\infty) = -\infty$, $-\infty - \infty = -\infty$. Moreover, $r \cdot \infty = \infty$ if r > 0, $r \cdot \infty = -\infty$ if r < 0, $r \cdot (-\infty) = -\infty$ if r > 0 and $r \cdot (-\infty) = \infty$ if r is negative. In the same logic,

$\infty \cdot \infty = (-\infty) \cdot (-\infty) = \infty$, $(-\infty) \cdot \infty = -\infty = \infty \cdot (-\infty)$, $\frac{1}{\pm\infty} = 0$, etc.

The operations $0 \cdot (\pm\infty)$, $\infty - \infty$ and $\frac{\pm\infty}{\pm\infty}$ are not permitted. We denote $\overline{R} = \{-\infty\} \cup R \cup \{\infty\}$ and call it the accomplished (or completed) real line. By definition, a neighborhood of $\infty$ is an open interval of the form $(M, \infty)$ and a neighborhood of $-\infty$ is an interval of the form $(-\infty, L)$, where M, L are real numbers. For instance, in $\overline{R}$ any subset of real numbers is bounded (upper and lower) and an unbounded (in R) increasing sequence is said to be "convergent to $\infty$" (for example, $x_n = n^2 \to \infty$). The sequence $y_n = (-1)^n n$ is bounded in $\overline{R}$ but it is not "convergent" there (Why?). Usually, if a sequence of real numbers is "convergent to $\infty$" in $\overline{R}$, we say that it is divergent in R. Sometimes, by abuse, we write $\lim x_n = \infty$ when the sequence $\{x_n\}$ is unbounded and increasing. If $\{x_n\}$ is a sequence in $\overline{R}$ and if $L(\{x_n\})$ is the set of all the limits of all the convergent subsequences of $\{x_n\}$, we denote by $\limsup\{x_n\}$ the $\sup L(\{x_n\})$ and by $\liminf\{x_n\}$ the $\inf L(\{x_n\})$. For instance, for the sequence $x_n = \sin\left(\frac{2n+1}{2}\pi\right) = (-1)^n$, $\limsup x_n = 1$ and $\liminf x_n = -1$ (prove this!).
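Floating-point arithmetic follows conventions very close to these rules, which gives a quick way to experiment with them. The Python sketch below relies on the behaviour of IEEE 754 floats (an assumption about the machine arithmetic, not part of the remark): permitted operations return an infinity or 0, while the operations that are not permitted return the special value nan ("not a number").

```python
import math

inf = math.inf
# Permitted operations with the symbols +infinity and -infinity.
print(5 + inf, -inf - inf, -3 * inf, inf * inf, 1 / inf)
# Operations that are not permitted produce nan.
print(0 * inf, inf - inf, inf / inf)
```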


THEOREM 8. a) Let $\{x_n\}$ be an increasing sequence in R. Then $\limsup x_n$ exists in $\overline{R}$ and the sequence is convergent to $\limsup x_n$ in $\overline{R}$. If $\{x_n\}$ is also upper bounded in R, then $\limsup x_n$ is its limit in R too, i.e. $\lim x_n = \limsup x_n$. b) Let $\{y_n\}$ be a decreasing sequence in R. Then $\liminf y_n$ always exists in $\overline{R}$ and the sequence is convergent to $\liminf y_n$ in $\overline{R}$. If $\{y_n\}$ is also lower bounded in R, then $\liminf y_n$ is also in R and so $\lim y_n = \liminf y_n$.


PROOF. We prove only a) and we think that b) is a good exercise for the reader. If $\{x_n\}$ is upper unbounded then, for any real number M, there is at least one n with $x_n > M$. Since $\{x_n\}$ is an increasing sequence, $x_{n+p} \ge x_n > M$ for any p = 1, 2, .... So, outside the neighborhood $(M, \infty)$ of $\infty$ we have only a finite number of terms of our sequence, i.e. $x_n \to \infty$, which is at the same time $\limsup x_n$ (Why?). If $\{x_n\}$ is upper bounded, then, using Theorem 7, we get that $c = \limsup x_n$ is a real number. Take now an $\varepsilon$-neighborhood $(c - \varepsilon, c + \varepsilon)$ of c. Since c is the LUB of the set $\{x_n\}$, we can apply Theorem 6 and find an $x_m$ in the interval $(c - \varepsilon, c]$. Since the sequence is increasing, $x_{m+1}, x_{m+2}, \dots$ are in the same interval (Why?). So, outside this interval one has at most a finite number of terms of our sequence, i.e. $x_n \to c$ (see Definition 1).


Let us come back to the approximation of $\sqrt{2} = 1.41 b_3 b_4 \dots b_n \dots$ (see (1.11)) by the increasing sequence $x_n = 1.41 b_3 b_4 \dots b_n$, n = 1, 2, ... of simple rational numbers. This last sequence $\{x_n\}$ is a sequence in Q but its limit $\sqrt{2}$ is not in Q. However, this sequence has an interesting property. If we fix an $n \in N$, and if we consider the terms $x_n, x_{n+1}, x_{n+2}, \dots, x_{n+p}$, we see that the distance between $x_n$ and $x_{n+p}$ goes to 0 independently of $p \in N$, but dependently of n. This means that from a rank N on the distance $dist(x_l, x_m)$ becomes smaller and smaller ($l, m \ge N$). Indeed,

$dist(x_n, x_{n+p}) = 0.\underbrace{00 \dots 0}_{n\ \text{times}} b_{n+1} b_{n+2} \dots b_{n+p} < 0.\underbrace{00 \dots 0}_{n\ \text{times}} 999\dots = \frac{1}{10^n} \to 0$

independently of p, i.e. for any small real number $\varepsilon > 0$, there is a rank $N_\varepsilon$ such that whenever $n \ge N_\varepsilon$ one has that $dist(x_n, x_{n+p}) < \varepsilon$, for any p = 1, 2, ....


DEFINITION 2. Let $\{x_n\}$ be a sequence of real numbers. We say that $\{x_n\}$ is a Cauchy sequence or a fundamental sequence if for any small positive real number $\varepsilon > 0$, there is a rank $N_\varepsilon$ (depending on $\varepsilon$) such that $|x_{n+p} - x_n| < \varepsilon$ for any $n \ge N_\varepsilon$ and for any p = 1, 2, .... This means that $|x_{n+p} - x_n| \to 0$, when $n \to \infty$, independently of p.


For instance, the above sequence $x_n = 1.41 b_3 b_4 \dots b_n$, n = 1, 2, ... is a Cauchy sequence of rational numbers which is not convergent in Q, but which is convergent in R, its limit being the real number $\sqrt{2}$. This is why we say that Q is not "complete".


DEFINITION 3. In general, a metric space X with its distance dist 
(see Remark 2) is said to be complete if any Cauchy sequence {x,,} with 
terms in X is convergent to a limit x of X. 


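Let us consider the following sequence

$x_n = \frac{\cos 1}{2} + \frac{\cos 2}{2^2} + \frac{\cos 3}{2^3} + \dots + \frac{\cos n}{2^n}$,

where the arcs are measured in radians. Let us prove that this last sequence is a Cauchy sequence. For this, let us evaluate the distance

$dist(x_n, x_{n+p}) = |x_{n+p} - x_n| = \left|\frac{\cos(n+1)}{2^{n+1}} + \frac{\cos(n+2)}{2^{n+2}} + \dots + \frac{\cos(n+p)}{2^{n+p}}\right| \le \frac{1}{2^{n+1}}\left(1 + \frac{1}{2} + \frac{1}{2^2} + \dots\right) = \frac{1}{2^n}$.

This last equality comes from the definition of the infinite geometrical progression

$1 + \frac{1}{2} + \frac{1}{2^2} + \dots = \lim_{n \to \infty}\left(1 + \frac{1}{2} + \frac{1}{2^2} + \dots + \frac{1}{2^n}\right) = \lim_{n \to \infty} \frac{1 - \frac{1}{2^{n+1}}}{1 - \frac{1}{2}} = 2$.

So $dist(x_n, x_{n+p})$ tends to 0 independently of p, because $\frac{1}{2^n}$ goes to 0, whenever $n \to \infty$, independently of p. Indeed, for a small $\varepsilon > 0$, let us find the first natural number $N_\varepsilon$ such that $\frac{1}{2^{N_\varepsilon}} < \varepsilon$. Applying $\log_2$ we get $N_\varepsilon > -\log_2 \varepsilon$, so $N_\varepsilon = [-\log_2 \varepsilon] + 1$, where $[\cdot]$ denotes the integer part. Now, if $n \ge N_\varepsilon$,

$dist(x_n, x_{n+p}) \le \frac{1}{2^n} \le \frac{1}{2^{N_\varepsilon}} < \varepsilon$,

independently of p.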
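The rank $N_\varepsilon = [-\log_2 \varepsilon] + 1$ can be tested numerically. The Python sketch below takes the sample value $\varepsilon = 10^{-3}$, an assumption made only for this illustration; the function names `x` and `rank` are hypothetical, and floating-point sums stand in for the exact real numbers.

```python
import math

def x(n):
    """Partial sum cos(1)/2 + cos(2)/2**2 + ... + cos(n)/2**n (arcs in radians)."""
    return sum(math.cos(k) / 2 ** k for k in range(1, n + 1))

def rank(eps):
    """N_eps = [-log2(eps)] + 1, meant for a small eps in (0, 1)."""
    return math.floor(-math.log2(eps)) + 1

eps = 1e-3
N = rank(eps)
print(N, 1 / 2 ** N)
# Distances between terms beyond the rank stay below eps, whatever p is.
print(max(abs(x(N + p) - x(N)) for p in range(1, 50)))
```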


THEOREM 9. Any convergent sequence $\{x_n\}$ to x is also a Cauchy sequence. Thus, the class of Cauchy sequences "appears" to be larger than the class of convergent sequences.


PROOF. We simply verify Definition 2. Let $\varepsilon$ be a positive small real number and let $N_\varepsilon$ be a rank (dependent on $\varepsilon$) such that $|x_n - x| < \frac{\varepsilon}{2}$ for any $n \ge N_\varepsilon$ (see Definition 1 with $\frac{\varepsilon}{2}$ instead of $\varepsilon$). So,

$|x_{n+p} - x_n| = |x_{n+p} - x + x - x_n| \le |x_{n+p} - x| + |x_n - x| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$

for any $n \ge N_\varepsilon$ and any p = 1, 2, .... Hence our convergent sequence is also a Cauchy sequence.


A basic result in Mathematics was discovered by Cauchy: "Any fundamental sequence of real numbers is convergent to a real number", i.e. R is a "complete metric space".

To prove this important result we need some specific properties of 
the Cauchy sequences. 


THEOREM 10. Any Cauchy sequence $\{x_n\}$ is bounded, i.e. there is a positive real number M such that $|x_n| \le M$ for any n = 0, 1, ... or, equivalently, there is an interval [A, B] in R such that all the terms of the sequence $\{x_n\}$ belong to this interval, i.e. $x_n \in [A, B]$ for any n = 0, 1, ... (Why this equivalence?).


PROOF. Take an arbitrary positive real number, for instance 2. Since $\{x_n\}$ is a Cauchy sequence, there is a rank N such that whenever $n \ge N$, $|x_{n+p} - x_n| < 2$ for any p = 1, 2, ... (see Definition 2). In particular, $|x_{N+p} - x_N| < 2$, or $x_{N+p} \in (x_N - 2, x_N + 2)$ for any $p \in N$. So, outside this last interval one may have at most $x_0, x_1, \dots, x_{N-1}$ as terms of our sequence. Take now $A = \min\{x_0, x_1, \dots, x_{N-1}, x_N - 2\}$ and $B = \max\{x_0, x_1, \dots, x_{N-1}, x_N + 2\}$. It is easy to see that all the terms of the sequence $\{x_n\}$ belong to the interval [A, B]. If one takes now $M = \max\{|A|, |B|\}$, then $x_n \in [-M, M]$, or $|x_n| \le M$ for any n = 0, 1, ....


Here is a strange property of the Cauchy sequences. 
THEOREM 11. If a Cauchy sequence $\{x_n\}$ contains at least one subsequence $\{x_{k_n}\}$ ($k_0 < k_1 < k_2 < \dots < k_n < \dots$) which is convergent to x, then the whole sequence $\{x_n\}$ is convergent to the same x. Therefore, all the other subsequences of $\{x_n\}$ are convergent to x.


PROOF. Let $\varepsilon$ be a small positive real number. Since $\{x_{k_n}\}$ is convergent to x whenever $n \to \infty$, for n large enough, let us assume that for $n \ge N'$, one has

(1.12)    $|x_{k_n} - x| < \frac{\varepsilon}{2}$.

Since $\{x_n\}$ is a Cauchy sequence, for n large enough, suppose $n \ge N''$, one has that

(1.13)    $|x_{n+p} - x_n| < \frac{\varepsilon}{2}$,

for any p = 1, 2, .... Let now N be a natural number greater than N' and than N'', at the same time. Let n be a fixed natural number greater than N and let us choose $k_m$ such that it is greater than this fixed n and m itself is greater than N. So, $k_m = n + p$, for a natural number p ($= k_m - n$). From (1.13) we get that

(1.14)    $|x_{k_m} - x_n| < \frac{\varepsilon}{2}$,

because $n > N > N''$. From (1.12) one has that

(1.15)    $|x_{k_m} - x| < \frac{\varepsilon}{2}$,

because $m > N > N'$. Now,

$|x_n - x| = |x_n - x_{k_m} + x_{k_m} - x| \le |x_{k_m} - x_n| + |x_{k_m} - x| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$.

And this is true for any n > N. Hence, the sequence $\{x_n\}$ is convergent to x. We leave it to the reader to convince himself (or herself) that if a sequence $\{x_n\}$ is convergent to a real number x, then any subsequence of it is also convergent to the same x.


We prove now a basic property of a bounded infinite subset A of 
real numbers. For this we give a definition. 


DEFINITION 4. We say that a subset A of real numbers has the 
point (real number) x as a limit point if there is a sequence {a,,}, with 
distinct terms a, from A, which is convergent to x. 


For instance, 0 is a limit point of

$\{1, \frac{1}{2}, \frac{1}{3}, \dots, \frac{1}{n}, \dots\}$

and of the interval [0, 1]. But 0 is NOT a limit point of the set B = {0, 1, 2} (Why?). N and Z have no limit points in R! (Why?). Find all the limit points of Q in R! (Hint: the whole R is the set of all the limit points of Q, why?)


THEOREM 12. (Cesaro-Bolzano-Weierstrass Theorem). Any infinite and bounded subset A of R has at least one limit point in R, i.e. there is an $x \in R$ and a nonconstant sequence $\{a_n\}$ with $a_n \in A$ for any n = 0, 1, ..., such that $a_n \to x$.


PROOF. Since A is bounded, there is a closed interval $[a_0, b_0]$ ($a_0, b_0 \in R$) which contains A. Let us divide this last interval into two equal closed subintervals and let us denote by $[a_1, b_1]$ one of these subintervals which contains an infinite number of elements of A. Let $x_1$ be in $[a_1, b_1]$ and in A, i.e. $x_1 \in [a_1, b_1] \cap A$. Let us divide now the interval $[a_1, b_1]$ into two equal closed subintervals and let us choose one of them, $[a_2, b_2]$, which contains an infinite number of elements from A. Let $x_2$ be in $A \cap [a_2, b_2]$ and $x_2 \neq x_1$. We continue to construct subintervals $[a_3, b_3], [a_4, b_4], \dots, [a_n, b_n], \dots$ and elements $x_n$ of $A \cap [a_n, b_n]$, such that $x_n \notin \{x_1, x_2, \dots, x_{n-1}\}$ for any n = 3, 4, .... Since the length of the interval $[a_n, b_n]$ is $\frac{l}{2^n}$, where l is $b_0 - a_0$, the length of the initial interval, we can use Cantor Axiom (Axiom 2) and find a unique real number x in the common intersection

$\bigcap_{n} [a_n, b_n]$

of all the intervals $[a_n, b_n]$. Since $x_n$ and x are in $[a_n, b_n]$, $dist(x_n, x) \le \frac{l}{2^n}$, so $x_n \to x$ (see Definition 1). Because $x_n$, n = 1, 2, ... are distinct elements of A, one has that x is a limit point of A and the theorem is completely proved.


THEOREM 13. (Cauchy test 1). Any fundamental (Cauchy) sequence in R is convergent in R, i.e. R is a complete metric space. This means that in R there is no difference between the set of convergent sequences and the set of Cauchy sequences (in Q there is! Why?).


PROOF. Let $\{y_n\}$ be a fundamental sequence in R. If $\{y_n\}$ has only a finite number of distinct terms then, from a rank on, the sequence becomes a constant sequence, so it is convergent to the value of the constant terms. Let us assume that $\{y_n\}$ has an infinite number of distinct terms, i.e. that the set $A = \{y_n\}$ is infinite. Since A is bounded (see Theorem 10) and infinite, it has a limit point y (see Theorem 12), i.e. there is a nonconstant subsequence $\{y_{k_n}\}$, n = 1, 2, ... of the sequence $\{y_n\}$, which is convergent to y. We apply now Theorem 11 and find that the whole sequence $\{y_n\}$ is convergent to y.
This theorem has not only a great theoretical importance, but a practical one too. For instance, take again the sequence

$x_n = \frac{\cos 1}{2} + \frac{\cos 2}{2^2} + \frac{\cos 3}{2^3} + \dots + \frac{\cos n}{2^n}$.

We proved that $\{x_n\}$ is a Cauchy sequence. Now, we know (see Theorem 13) that it is also a convergent sequence to an unknown limit x (we cannot express this limit as a decimal fraction!). Knowing that $x_n \to x$ is a very good situation! For a large n we can approximate x with $x_n$. But this last one can be easily computed with a usual computer. So, we have a good idea about the limit. Moreover, the Cauchy
test 1 is useful to check if a sequence is convergent or not. For instance, 
the sequence $\{a_n\}$ is recurrently defined: $a_0 = 0$, $a_n = \sqrt{2 + a_{n-1}}$ for n = 1, 2, .... Let us prove that it is a Cauchy sequence. Indeed,

(1.16)    $a_n - a_{n-1} = \sqrt{2 + a_{n-1}} - \sqrt{2 + a_{n-2}} = \frac{a_{n-1} - a_{n-2}}{\sqrt{2 + a_{n-1}} + \sqrt{2 + a_{n-2}}} < \frac{1}{2}(a_{n-1} - a_{n-2})$.

We can apply (1.16) (n - 1)-times and find

$a_n - a_{n-1} < \frac{1}{2}(a_{n-1} - a_{n-2}) < \frac{1}{2^2}(a_{n-2} - a_{n-3}) < \dots < \frac{1}{2^{n-1}}(a_1 - a_0)$.

So,

$a_{n+p} - a_n = (a_{n+p} - a_{n+p-1}) + (a_{n+p-1} - a_{n+p-2}) + \dots + (a_{n+1} - a_n) < \left(\frac{1}{2^{n+p-1}} + \frac{1}{2^{n+p-2}} + \dots + \frac{1}{2^n}\right)(a_1 - a_0) < \frac{1}{2^n}\left(1 + \frac{1}{2} + \frac{1}{2^2} + \dots\right)(a_1 - a_0) = \frac{1}{2^{n-1}}(a_1 - a_0)$.

Here we just used that

$1 + \frac{1}{2} + \frac{1}{2^2} + \dots = \frac{1}{1 - \frac{1}{2}} = 2$.

Since $\{a_n\}$ is an increasing sequence (Why?), one has that

$|a_{n+p} - a_n| < \frac{1}{2^{n-1}}(a_1 - a_0)$,

so $|a_{n+p} - a_n|$ can be made as small as we want when $n \to \infty$, independently of p. Thus, $\{a_n\}$ is a Cauchy sequence (see Definition 2). Hence $\{a_n\}$ is convergent to a limit l (see Cauchy test 1). As we shall see in the following theorem (Theorem 14), we can apply the "operation" lim to the equality $a_n = \sqrt{2 + a_{n-1}}$ and find: $l = \sqrt{2 + l}$, or l = 2. Therefore, $\lim a_n = 2$.
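The behaviour of this recurrence is easy to observe numerically. The Python sketch below takes the starting value $a_0 = 0$ and uses floating-point square roots, so it only suggests the limit 2 and does not replace the proof; the function name `a` is chosen for this example.

```python
import math

def a(n):
    """n-th term of the recurrence a_0 = 0, a_n = sqrt(2 + a_{n-1})."""
    value = 0.0
    for _ in range(n):
        value = math.sqrt(2 + value)
    return value

for n in (1, 2, 5, 10, 20):
    # The last column is the remaining gap to the limit 2.
    print(n, a(n), 2 - a(n))
```

The consecutive differences $a_{n+1} - a_n$ can also be compared with the bound $\frac{1}{2^n}(a_1 - a_0)$ obtained from (1.16).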


Now, we describe some compatibilities of the "operation" lim (which associates to a convergent sequence its limit), with the algebraic operations "+", "-", "·", "÷", with the order relation "$\le$", with the functions $x^m$, $\sqrt[m]{x}$, $\exp x$, $\ln x$, $a^x$, $\log_a x$ (a > 0), $\sin x$, $\cos x$, $\tan x$, $\cot x$ and with their compositions. This means, ... with all the elementary functions. We recall a basic definition:


DEFINITION 5. Let $(X, d_1)$ and $(Y, d_2)$ be two metric spaces and let $f : X \to Y$ be a mapping defined on X with values in Y. We say that f is continuous at $x \in X$ (with respect to these metric space structures) if for any convergent sequence $\{x_n\}$ in X, $\{x_n\} \to x$, i.e. $d_1(x_n, x) \to 0$ as $n \to \infty$, one has that the corresponding sequence of the images, $\{f(x_n)\}$, is convergent to f(x) in Y, i.e. $d_2(f(x_n), f(x)) \to 0$, when $n \to \infty$. If f is continuous at any x of X, we say that f is continuous on X.


All the elementary functions (polynomials, rational functions, power
functions, exponential and logarithmic functions, trigonometric functions
and their compositions) are continuous on their definition domains.
Proving this is not always easy. For instance, what do we mean by
$3^{\sqrt{2}}$? First of all, we define $3^{1/n} := \sqrt[n]{3}$ as the unique
positive real root of the equation $X^n - 3 = 0$. Then we define
$3^{m/n} \overset{\text{def}}{=} (3^{1/n})^m$. By $3^{-r}$ we understand $\frac{1}{3^r}$. Then, we approximate $\sqrt{2}$
with an increasing sequence $\{r_n\}$ of rational numbers, i.e. $r_n \to \sqrt{2}$
and $r_n \leq r_{n+1}$ for any $n = 1, 2, \ldots$. As we know, we simply take for $r_n$
the rational number $1.b_1 b_2 \ldots b_n$, i.e. we drop all the decimals of $\sqrt{2}$
from the $(n+1)$-th decimal on. Now, by definition, $3^{\sqrt{2}} = \lim_{n \to \infty} 3^{r_n}$. To
prove the existence of this limit is not an easy task. It is sufficient to
prove that the sequence $\{3^{r_n}\}$ is a Cauchy sequence. But even this
is difficult! So, the proof of the continuity of the power function
$x \mapsto 3^x$ is not so easy at all! This is why we tacitly assume that all the
elementary functions are continuous.
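The construction of $3^{\sqrt{2}}$ through decimal truncations can be illustrated with a short sketch. It is an illustration under stated assumptions, not a proof: the names `truncation`, `rationals` and `powers` are hypothetical, $\sqrt{2}$ is taken from Python's `decimal` module with 40 significant digits, and the powers $3^{r_n}$ are evaluated in ordinary floating point.

```python
from decimal import Decimal, getcontext, ROUND_DOWN
import math

getcontext().prec = 40
ROOT2 = Decimal(2).sqrt()

def truncation(n):
    """r_n: sqrt(2) with every decimal after the n-th one dropped."""
    return ROOT2.quantize(Decimal(1).scaleb(-n), rounding=ROUND_DOWN)

# r_1 = 1.4, r_2 = 1.41, r_3 = 1.414, ...
rationals = [truncation(n) for n in range(1, 13)]

# Floating-point stand-ins for 3^{r_n}.
powers = [3.0 ** float(r) for r in rationals]

for r, p in zip(rationals, powers):
    print(r, p)
print("3 ** sqrt(2) in floating point:", 3.0 ** math.sqrt(2.0))
```

The printed values stabilise digit by digit, which is the numerical face of the Cauchy property mentioned in the text.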


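THEOREM 14. Let $\{x_n\}$ and $\{y_n\}$ be two sequences convergent to $x$
and to $y$ respectively. Then:

a) $\{x_n \pm y_n\} \to x \pm y$,

b) $\{x_n y_n\} \to xy$,

c) If $y_n$ and $y$ are not zero for any $n = 0, 1, \ldots$, then $\frac{1}{y_n} \to \frac{1}{y}$,

d) If $x_n \leq y_n$ for any $n = 0, 1, \ldots$, then $x \leq y$,

e) $\{(x_n)^m\} \to x^m$ for any fixed natural number $m$,

f) $\sqrt[m]{x_n} \to \sqrt[m]{x}$ if $m$ is odd and, for $x_n \geq 0$, $\sqrt[m]{x_n} \to \sqrt[m]{x}$ for any
natural number $m$,

g) $\{\exp x_n\} \to \exp x$ and, if $x_n > 0$ and $x > 0$, then $\{\ln x_n\} \to \ln x$,

h) $\{a^{x_n}\} \to a^x$ and, if $x_n > 0$ and $x > 0$, $\{\log_a x_n\} \to \log_a x$ for any fixed
$a > 0$ ($a \neq 1$ for the logarithm),

i) $\sin x_n \to \sin x$, $\cos x_n \to \cos x$, $\tan x_n \to \tan x$, $\cot x_n \to \cot x$
(where these functions are defined).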


PROOF. (partially) a) Let us prove for instance that $\{x_n + y_n\} \to x + y$.
For this, let us evaluate the difference:

$|x_n + y_n - (x + y)| = |(x_n - x) + (y_n - y)| \leq |x_n - x| + |y_n - y|.$
But $|x_n - x| \to 0$ and $|y_n - y| \to 0$, so their sum tends to 0 too (Why?).
Thus, $|x_n + y_n - (x + y)|$ also goes to 0.

d) Assume that $x > y$ and take $c = \frac{x - y}{2}$. Let us consider the open
intervals $I = (y - c, y + c)$ and $J = (x - c, x + c)$. Since $x_n \to x$ and
$y_n \to y$, for a large $n$ one has $x_n \in J$ and $y_n \in I$. But any element
of $I$ is less than any element of $J$. Hence $y_n < x_n$ and we obtain a
contradiction, because, for any $n$, one has in the hypothesis of d) that
$x_n \leq y_n$.

i) Let us prove for instance that $\sin x_n \to \sin x$, whenever $x_n \to x$.
First of all we remark that $|\sin \alpha| = \sin |\alpha|$ for any $\alpha \in (-\frac{\pi}{2}, \frac{\pi}{2})$. Since
$x_n \to x$, one can take $n$ large enough such that $x_n - x \in (-\frac{\pi}{2}, \frac{\pi}{2})$. If
$\alpha$ is measured in radians and $\alpha \in (-\frac{\pi}{2}, \frac{\pi}{2})$ then an easy geometrical
construction (see Fig. 1.2) tells us that $\sin |\alpha| \leq |\alpha|$.

Let us use now some trigonometry:

$|\sin x_n - \sin x| = 2 \left|\sin \frac{x_n - x}{2}\right| \left|\cos \frac{x_n + x}{2}\right| \leq 2 \cdot \frac{|x_n - x|}{2} = |x_n - x|,$

so $|\sin x_n - \sin x| \to 0$, whenever $x_n \to x$.

Fig. 1.2: $|BC| = \sin |\alpha| \leq |BA| \leq$ length(arc $BA$) $= |\alpha|$


COROLLARY 1. Let $f : A \to B$ and $g : B \to C$ ($A, B, C$ are subsets
of $\mathbb{R}$) be two functions with the following property: $f(x_n) \to f(x)$
and $g(y_n) \to g(y)$ for any sequences $\{x_n\}$ in $A$ convergent to $x \in A$ and
$\{y_n\}$ in $B$ convergent to $y \in B$. Then $(g \circ f)(x_n) \to (g \circ f)(x)$. The functions
$f$ and $g$ considered here are continuous on their definition domains in the
sense of Definition 5. So, the composition of two continuous functions is
also a continuous function. Moreover, the sum, the difference, the product and
the quotient (where the denominator is not zero) of two continuous functions
are also continuous functions.


PROOF. Since $f$ and $g$ are continuous (see the definition in the
statement of the corollary), $x_n \to x$ implies $f(x_n) \to f(x)$ (continuity
of $f$). Since $g$ is continuous, $g(f(x_n)) \to g(f(x))$, i.e.
$(g \circ f)(x_n) \to (g \circ f)(x)$. Thus $g \circ f$ is also continuous. The other
statements are easy consequences of some of the previous statements of the
above theorem (prove them!).


2. Sequences of complex numbers 


Let $\mathbb{C}$ be the complex number field. Since any element $z$ of $\mathbb{C}$ is a
pair $z = (x, y)$ of two real numbers and since the element $i = (0, 1)$ has
the property that $i(y, 0) = (0, y)$ (see the multiplication rule defined in
(1.10)), we can write $z = x + iy$, where we identify $(x, 0)$ and $(y, 0)$ with
$x$ and $y$ respectively. Let us fix a Cartesian coordinate system $\{O; \vec{i}, \vec{j}\}$
in a plane $(P)$. Here $\vec{i}$ and $\vec{j}$ are orthogonal versors (unit vectors) and they
give the directions and the orientations of the $Ox$-axis and $Oy$-axis respectively.
Since any vector $\overrightarrow{OM}$, where $M$ is an arbitrary point in the plane $(P)$,
can be uniquely written as $\overrightarrow{OM} = x\vec{i} + y\vec{j}$, where $x, y \in \mathbb{R}$, we call $x$
and $y$ the coordinates of the point $M$. Write $M(x, y)$. The association
$z = x + iy \mapsto M(x, y)$ gives rise to a geometrical representation of the
complex number field $\mathbb{C}$. This is why we always call $\mathbb{C}$ the complex
plane. The distance $d$ between two complex numbers $z_1 = x_1 + iy_1$ and
$z_2 = x_2 + iy_2$ is simply the distance between their corresponding points
$M_1(x_1, y_1)$ and $M_2(x_2, y_2)$ respectively, i.e.

$d(z_1, z_2) \overset{\text{def}}{=} \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}.$

It is not difficult to check the three properties of a distance function
for this $d$.
A sequence $\{z_n\}$ of complex numbers is said to be convergent to $z$
if the sequence of real numbers $\{d(z_n, z)\}$ is convergent to
0. For instance, $z_n = \frac{1}{n} + \left(1 + \frac{1}{n}\right)^n i$ is convergent to $ei$ because

$d(z_n, ei) = \sqrt{\left(\frac{1}{n} - 0\right)^2 + \left[\left(1 + \frac{1}{n}\right)^n - e\right]^2} \to 0.$

The sequence $\{z_n\}$ is said to be fundamental (or Cauchy) if for any $\varepsilon > 0$
there is a natural number $N_\varepsilon$ (depending on $\varepsilon$) such that $d(z_{n+p}, z_n) < \varepsilon$
for any $n \geq N_\varepsilon$ and for any $p = 1, 2, \ldots$.
The following result reduces the study of the convergence of a sequence
$z_n = x_n + y_n i$ in $\mathbb{C}$ to the study of the convergence of the real
and imaginary parts $\{x_n\}$ and $\{y_n\}$ respectively.


THEOREM 15. Let $\{z_n = x_n + y_n i\}$ be a sequence of complex numbers
(here $x_n$ and $y_n$ are real numbers). Then the sequence $\{z_n\}$ is
convergent to the complex number $z = x + yi$ if and only if $x_n \to x$ and
$y_n \to y$ as sequences of real numbers.


PROOF. One has the following double implications:

$z_n \to z \iff d(z_n, z) = \sqrt{(x_n - x)^2 + (y_n - y)^2} \to 0 \iff x_n - x \to 0$
and $y_n - y \to 0$ (simultaneously), i.e. if and only if $x_n \to x$ and
$y_n \to y$.


The sequence $z_n = 3 + \left(2n \sin \frac{1}{n}\right) i$ tends to $3 + 2i$ because $3 \to 3$
and $2n \sin \frac{1}{n} = 2 \cdot \frac{\sin \frac{1}{n}}{\frac{1}{n}} \to 2$.
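The two examples of convergent complex sequences given so far can be checked numerically. The sketch below is only an illustration: the function names `dist`, `z_first` and `z_second` are hypothetical, and Python's built-in `complex` type stands in for $\mathbb{C}$ with $d(z, w) = |z - w|$.

```python
import math

def dist(z, w):
    """Distance d(z, w) between two complex numbers."""
    return abs(z - w)

def z_first(n):
    """z_n = 1/n + (1 + 1/n)^n i, expected to tend to e*i."""
    return complex(1.0 / n, (1.0 + 1.0 / n) ** n)

def z_second(n):
    """z_n = 3 + (2 n sin(1/n)) i, expected to tend to 3 + 2i."""
    return complex(3.0, 2.0 * n * math.sin(1.0 / n))

limit_first = complex(0.0, math.e)
limit_second = complex(3.0, 2.0)

for n in (10, 100, 1000, 10000):
    print(n, dist(z_first(n), limit_first), dist(z_second(n), limit_second))
```

Both columns shrink as $n$ grows, and in each case the distance is squeezed between the larger of the two coordinate errors and their sum, which is the content of Theorem 15.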
THEOREM 16. Relative to the distance $d$, the complex number field
$\mathbb{C}$ is complete, i.e. any Cauchy sequence $\{z_n\}$ of $\mathbb{C}$ is convergent to a
complex number $z$.


PROOF. Let $z_n = x_n + y_n i$, where $x_n$ and $y_n$ are real numbers. Since
$\{z_n\}$ is a Cauchy sequence if and only if $d(z_{n+p}, z_n)$ is as small as we
want when $n$ is large enough, independently of $p = 1, 2, \ldots$, and since

$d(z_{n+p}, z_n) = \sqrt{(x_{n+p} - x_n)^2 + (y_{n+p} - y_n)^2},$

one sees that $|x_{n+p} - x_n|$ and $|y_{n+p} - y_n|$ are simultaneously small enough
whenever $n$ is large enough, independently of $p$. But this is equivalent
to saying that $\{x_n\}$ and $\{y_n\}$ are both Cauchy sequences. Since $\mathbb{R}$ is
complete (see Theorem 13), $\{x_n\}$ is convergent to a real number $x$ and
$\{y_n\}$ is convergent to another real number $y$. Let us put $z = x + yi$.
Applying now Theorem 15 we get that $\{z_n\}$ is convergent to $z$.


We say that a subset $A$ of $\mathbb{C}$ is bounded if there is a sufficiently
large ball $B(0, r) = \{z \in \mathbb{C} \mid |z| = d(0, z) < r\}$, with centre at 0 and
of radius $r > 0$, such that $A \subset B(0, r)$. We also have for $\mathbb{C}$ a
Bolzano-Weierstrass type theorem. Namely, any infinite bounded sequence $\{z_n\}$
of complex numbers has a convergent subsequence. If we add a symbol
$\infty$ to $\mathbb{C}$ with properties similar to those of the infinite $\infty$ for $\mathbb{R}$, we get
$\overline{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$, the Riemann sphere. It is easy to see that in $\overline{\mathbb{C}}$ any sequence
has a convergent subsequence. Because of this last property, we say
that $\overline{\mathbb{C}}$ and $\overline{\mathbb{R}}$ are the "compactifications" of $\mathbb{C}$ and of $\mathbb{R}$ respectively.
Generally, in a metric space $(A, d)$ a subset $M$ is said to be compact
if any sequence of $M$ has at least one convergent subsequence with its
limit in $M$. For instance, any closed interval $[a, b]$ is a compact subset
of $\mathbb{R}$ (because of the Bolzano-Weierstrass Theorem). A subset $C'$ of $\mathbb{C}$ is
said to be closed if for any sequence $\{z_n\}$ of elements in $C'$ which is
convergent to $z$ in $\mathbb{C}$, its limit $z$ is also in $C'$. Then, the compact subsets
of $\mathbb{C}$ are exactly the closed and bounded subsets of $\mathbb{C}$ (do you have any
idea how to prove this? Try an idea similar to the one used in the real line
situation!)


3. Problems 


1. Prove that the following subsets of $\mathbb{R}$ have the same cardinal:
a) $A = (0, 1)$ and $B = \mathbb{R}$, b) $A = (0, 1]$ and $B = \mathbb{R}$, c) $A = (-\infty, a)$
and $B = \mathbb{R}$, d) $A = (0, 1)$ and $B = (a, b)$, e) $A = (a, \infty)$ and $B = (0, 1]$,
f) $A = \mathbb{Q} \cap (0, 3]$ and $B = \mathbb{Q} \cap [-7, 3]$.

2. Prove that $\sup(A + B) = \sup A + \sup B$ and, if $A, B \subset [0, \infty)$,
then $\sup(A \cdot B) = \sup A \cdot \sup B$, where $A + B = \{x + y \mid x \in A, y \in B\}$
and $A \cdot B = \{xy \mid x \in A, y \in B\}$. Define $\inf A$ and prove the same
equalities for inf instead of sup.

3. Construct $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$ and prove that any sequence
of elements in $\overline{\mathbb{R}}$ has a convergent subsequence in $\overline{\mathbb{R}}$. Prove that if
a sequence $\{x_n\}$ is convergent in $\overline{\mathbb{R}}$, then it has only one limit point,
namely the limit of the sequence. Find the limit points for the sequence
dn = cost, n = 0,1,2,.... Recall that ¢ € M is a limit point of a 
subset $A$ of a metric space $(M, d)$ if there is a nonconstant sequence
$\{x_n\}$ of elements from $A$, which is convergent to $x$.

4. Prove that if $\frac{a_{n+1}}{a_n} \to l$, where $a_n > 0$ for any $n$, then $\sqrt[n]{a_n} \to l$.


(2n)! 
1-3-5-....(4n-41)? 


Apply this result to compute the limit: lim ” whenever 
nN — OO. 

5. Prove that the set R \ Q of irrational numbers is not countable. 
Prove that it has the same cardinal as the cardinal of R (i.e. there is a 
bijection between R \ Q and R). 

6. Prove that the length of the diagonal of a square which has the 
side a rational number, is not a rational number. 

7. Are V5 and \/3 rational numbers? Are they algebraic numbers? 

8. Prove that the metric space $([0, 1), d)$, where $d(x, y) = |x - y|$,
is not a complete metric space, i.e. there is at least one Cauchy sequence
$\{x_n\}$, $x_n \in [0, 1)$, which has no limit in $[0, 1)$. Prove that its limit
in $\mathbb{R}$ must be 1.

9. Define the notion of "boundedness" in a general metric space.
Is Cesaro's Lemma (any infinite bounded sequence has at least one
convergent subsequence) true in a general metric space? Find a simple
counterexample.

10. Why does a decreasing sequence always have a limit in $\overline{\mathbb{R}}$? If instead
of $\overline{\mathbb{R}}$ you put $\overline{\mathbb{Q}} = \mathbb{Q} \cup \{-\infty, \infty\}$, is the last statement also true?

11. Prove that the Archimedes' Axiom is equivalent to the fact that
$\lim \frac{1}{n} = 0$. If instead of this last limit we put $\lim \frac{2n+3}{3n-2} = \frac{2}{3}$, does our
statement work too?


CHAPTER 2 


Series of numbers 


1. Series with nonnegative real numbers 
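We know how to add a finite number of real numbers $a_1, a_2, \ldots, a_n$:
$s_n = (\ldots((a_1 + a_2) + a_3) + \ldots + a_{n-1}) + a_n.$
For instance,
$s_4 = 7 + 3 + (-4) + 5 = 10 + (-4) + 5 = 6 + 5 = 11.$


However, we have already met infinite sums when we discussed
the representation of a real number as a decimal fraction. For instance,

$3.3444\ldots = 3.3(4) = 3 + \frac{3}{10} + \frac{4}{10^2} + \frac{4}{10^3} + \ldots$
$= \frac{33}{10} + \lim_{n \to \infty} \left(\frac{4}{10^2} + \frac{4}{10^3} + \ldots + \frac{4}{10^n}\right)$
$= \frac{33}{10} + \frac{4}{10^2} \lim_{n \to \infty} \frac{1 - \frac{1}{10^{n-1}}}{1 - \frac{1}{10}} = \frac{33}{10} + \frac{4}{90} = \frac{301}{90}.$

Generally, if $m$ and $n$ are digits, then
$0.m(n) = \frac{\overline{mn} - m}{90},$
where $\overline{mn} = 10m + n$ is the two-digit number with digits $m$ and $n$
(Prove it!).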
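The formula for $0.m(n)$ can be tested with exact rational arithmetic before proving it. The following sketch is an illustration only; the function names `mixed_repeating` and `closed_form` and the cut-off of 12 decimal places are assumptions made for the example.

```python
from fractions import Fraction

def mixed_repeating(m, n, places):
    """Exact partial sum of 0.m(n): m/10 + n/10^2 + ... + n/10^places."""
    return Fraction(m, 10) + sum(Fraction(n, 10 ** k) for k in range(2, places + 1))

def closed_form(m, n):
    """Right-hand side of the formula: (10m + n - m) / 90."""
    return Fraction(10 * m + n - m, 90)

# The example from the text: 3.3(4) = 3 + 0.3(4).
print(3 + closed_form(3, 4))

# The partial sums differ from the closed form exactly by the neglected tail n / (9 * 10^places).
for m, n in [(3, 4), (0, 9), (7, 1)]:
    print(m, n, closed_form(m, n) - mixed_repeating(m, n, 12))
```

Since the gap equals $\frac{n}{9 \cdot 10^{p}}$ after $p$ decimal places, it tends to 0, which is exactly the limit argument used above for $3.3(4)$.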
Since such infinite sums (called series) appear in many applications 
of Mathematics, we start here a systematic study of them. 


DEFINITION 6. Let $\{a_n\}$ be a sequence of real numbers. The infinite
sum

(1.1) $\sum_{n=0}^{\infty} a_n = a_0 + a_1 + \ldots + a_n + \ldots$

is by definition the value (if this one exists) of the limit $s = \lim_{n \to \infty} s_n$,
where $s_n = a_0 + a_1 + \ldots + a_n$ is called the partial sum of order $n$. The new
mathematical object defined in (1.1) is said to be the series of general
term $a_n$ and of sum $s$ (if the limit exists). If $s$ exists and is finite we say
that the series (1.1) is convergent. Otherwise we say that the
series (1.1) is divergent.


For instance, the series

$\sum_{n=0}^{\infty} \frac{1}{2^n} = 1 + \frac{1}{2} + \frac{1}{2^2} + \ldots$

is convergent to 2, or its sum is 2, whereas the series $\sum_{n=0}^{\infty} n = \infty$, or
$\sum_{n=0}^{\infty} (-1)^n$ are divergent. The last divergent series is said to be oscillatory
because its partial sums have the values 1 or 0, i.e. it oscillates between
the distinct values $\{0, 1\}$.


THEOREM 17. Let $x$ be a real number. The geometrical series $\sum_{n=0}^{\infty} x^n$
is convergent (and its sum is $\frac{1}{1-x}$) if and only if $|x|$ is less than 1.


PROOF. By Definition 6, for $x \neq 1$,

$\sum_{n=0}^{\infty} x^n = \lim_{n \to \infty} (1 + x + x^2 + \ldots + x^n) = \lim_{n \to \infty} \frac{1 - x^{n+1}}{1 - x}.$

Since, for $x \neq 1$, $\lim x^{n+1}$ exists and is finite if and only if $|x| < 1$ (when the limit
is 0), the series $\sum_{n=0}^{\infty} x^n$ is convergent if and only if $|x| < 1$. In this last
case, its sum is $s = \lim_{n \to \infty} \frac{1 - x^{n+1}}{1 - x} = \frac{1}{1 - x}$. For instance, if $x = 1$, then the
series becomes $1 + 1 + 1 + \ldots = \infty$ (in $\overline{\mathbb{R}}$). If $x > 1$, then $\lim x^{n+1} = \infty$.
If $x \leq -1$, then the sequence $\{x^{n+1}\}$ has no limit at all (why?) so
$\lim_{n \to \infty} \frac{1 - x^{n+1}}{1 - x}$ also does not exist.
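The behaviour described in Theorem 17 is easy to observe on a computer. The sketch below is an illustration, with the hypothetical helper `geometric_partial_sum`; it compares partial sums with $\frac{1}{1-x}$ for a few values with $|x| < 1$ and shows the oscillation for $x = -1$.

```python
def geometric_partial_sum(x, n):
    """Return 1 + x + x**2 + ... + x**n."""
    total, power = 0.0, 1.0
    for _ in range(n + 1):
        total += power
        power *= x
    return total

# |x| < 1: the partial sums approach 1 / (1 - x).
for x in (0.5, -0.9, 0.99):
    print(x, geometric_partial_sum(x, 5000), 1.0 / (1.0 - x))

# x = -1: the partial sums oscillate and have no limit.
oscillating = [geometric_partial_sum(-1.0, n) for n in range(6)]
print(oscillating)
```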


THEOREM 18. (The Cauchy general test) A series $\sum_{n=0}^{\infty} a_n$ is convergent
if and only if the sequence of partial sums $\{s_n\}$ is a Cauchy
sequence, i.e. for any small real number $\varepsilon > 0$, there is a natural
number $N_\varepsilon$ such that
$|a_{n+1} + a_{n+2} + \ldots + a_{n+p}| < \varepsilon$
for any $n \geq N_\varepsilon$ and for any $p = 1, 2, \ldots$.


PROOF. We only use the fact that $\mathbb{R}$ is complete, i.e. that the
sequence $\{s_n\}$ is convergent if and only if it is a Cauchy sequence.

COROLLARY 2. (The zero test) If the sequence $\{a_n\}$ does not tend
to zero, then the series $\sum_{n=0}^{\infty} a_n$ is divergent. Equivalently, if the series $\sum_{n=0}^{\infty} a_n$ is
convergent, then $a_n \to 0$.


PROOF. If the series $\sum_{n=0}^{\infty} a_n$ were convergent, then the sequence of
partial sums $\{s_n\}$ would be a Cauchy sequence (see Theorem 18). Thus,
for $n$ large enough, $a_n = s_n - s_{n-1}$ becomes smaller and smaller, i.e.
$a_n \to 0$. In fact, we do not need the previous theorem. Indeed, let
$s = \sum_{n=0}^{\infty} a_n$ and write $a_n = s_n - s_{n-1}$. Then, $\lim a_n = s - s = 0$.


For instance, $\sum_{n=1}^{\infty} \left(\frac{n+1}{n}\right)^n$ is divergent, because $a_n = \left(\frac{n+1}{n}\right)^n \to e \neq 0$.


THEOREM 19. (The renouncement test) Let us consider the series
$\sum_{n=0}^{\infty} a_n$ and $\sum_{n=N}^{\infty} a_n = a_N + a_{N+1} + \ldots$ (we just dropped the terms
$a_0, a_1, \ldots, a_{N-1}$ of the previous series). Then these two series have the
same nature (i.e. they are convergent or divergent at the same time).
Moreover, if they are convergent, then $s = s' + a_0 + a_1 + \ldots + a_{N-1}$,
where $s = \sum_{n=0}^{\infty} a_n$ and $s' = \sum_{n=N}^{\infty} a_n$.


PROOF. Let $n$ be large enough ($n > N$) and let $s_n = a_0 + a_1 + \ldots +
a_{N-1} + a_N + \ldots + a_n$. If we denote $s'_n = a_N + \ldots + a_n$, then $s'_n$ is the partial
sum of order $n$ of the series $s'$. It is clear that $s_n = s'_n + a_0 + a_1 + \ldots + a_{N-1}$
and that the sequences $\{s_n\}$ and $\{s'_n\}$ are convergent or divergent at the
same time (prove it!). Now, in the last equality, let $n \to \infty$.
We get $s = s' + a_0 + a_1 + \ldots + a_{N-1}$ and the proof is completed.


Let $\sum_{n=0}^{\infty} a_n$ be a series with
$a_n = n$ if $n \leq 100$ and $a_n = \frac{1}{2^n}$ if $n > 100$.


The question is: "What is the nature of this series?" So we must decide if
our series is convergent or not. Let us renounce the terms $a_0, a_1, \ldots, a_{100}$
in the initial series. We get a new series

$\sum_{n=101}^{\infty} \frac{1}{2^n} = \frac{1}{2^{101}} \left(1 + \frac{1}{2} + \frac{1}{2^2} + \ldots\right).$

Let us use now Theorem 17 and Theorem 19 and find that

$\sum_{n=0}^{\infty} a_n = \sum_{n=0}^{100} n + \frac{1}{2^{101}} \cdot \frac{1}{1 - \frac{1}{2}} = \frac{100 \cdot 101}{2} + \frac{1}{2^{100}}.$
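The renouncement test in this example can be checked with exact fractions. The sketch below is illustrative and assumes the reading $a_n = n$ for $n \leq 100$ and $a_n = \frac{1}{2^n}$ for $n > 100$; the names `a`, `partial_sum`, `limit` and `gap` are hypothetical.

```python
from fractions import Fraction

def a(n):
    # Assumed general term: n for n <= 100, 1/2^n afterwards.
    return Fraction(n) if n <= 100 else Fraction(1, 2 ** n)

def partial_sum(n):
    """s_n = a_0 + a_1 + ... + a_n, computed exactly."""
    return sum((a(k) for k in range(n + 1)), Fraction(0))

# Sum predicted by Theorems 17 and 19: 100*101/2 + 1/2^100.
limit = Fraction(100 * 101, 2) + Fraction(1, 2 ** 100)

# What is still missing after 300 terms is the geometric tail 1/2^300.
gap = limit - partial_sum(300)
print(float(limit), gap == Fraction(1, 2 ** 300))
```

The first 101 terms only shift the sum by the finite amount $\frac{100 \cdot 101}{2}$; convergence is decided entirely by the geometric tail.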


THEOREM 20. (The boundedness test) Let $\sum_{n=0}^{\infty} a_n$ be a series with
nonnegative terms ($a_n \geq 0$). Then the series is convergent if and only
if the partial sums sequence $\{s_n\}$, $s_n = a_0 + a_1 + \ldots + a_n$, is bounded.


PROOF. Let us assume that the series $\sum_{n=0}^{\infty} a_n$ is convergent, i.e. the
sequence $\{s_n\}$ is convergent. Since any convergent sequence is bounded
(see also Theorem 10), one has that $\{s_n\}$ is bounded.

Conversely, we suppose that $\{s_n\}$ is bounded. Since $a_n \geq 0$, $s_n \leq
s_{n+1}$, i.e. the sequence $\{s_n\}$ is increasing. But Theorem 8 says that
an increasing and bounded sequence $\{s_n\}$ is convergent to its superior
limit $\limsup s_n$. Thus the series $\sum_{n=0}^{\infty} a_n$ is convergent to this $\limsup s_n$,
i.e. its sum is $s = \limsup s_n$.


THEOREM 21. (The integral test) Let $c$ be a fixed real number and let
$f : [c, \infty) \to [0, \infty)$ be a decreasing continuous function (see Definition
5). Let $n_0$ be a natural number greater than or equal to $c$. For any $n \geq n_0$
let $a_n = f(n)$ and let $A_n = \int_{n_0}^{n} f(x)\,dx$. Then the series
$\sum_{n=n_0}^{\infty} a_n$ is convergent if and only if the sequence $\{A_n\}$ is convergent (it
is sufficient to be bounded, why?).


PROOF. Suppose that the series $\sum_{n=n_0}^{\infty} a_n = \sum_{n=n_0}^{\infty} f(n)$ is convergent.
Since in Fig. 2.1 $s_n = f(n_0) + \ldots + f(n)$ is exactly the sum of the hatched
and of the double hatched areas and since the integral $A_n = \int_{n_0}^{n} f(x)\,dx$
is equal to the area under the graph of $y = f(x)$ which corresponds
to the interval $[n_0, n]$, then $A_n \leq s_n$. Since $\sum_{n=n_0}^{\infty} a_n$ is convergent, the
sequence $\{s_n\}$ is bounded, thus the sequence $\{A_n\}$ is bounded.
Conversely, let us assume that the sequence $\{A_n\}$ is bounded. Look
again at Fig. 2.1! We see that the double hatched area is just equal to
$a_{n_0+1} + a_{n_0+2} + \ldots + a_{n+1} = s_{n+1} - a_{n_0}$. Since this double hatched area
is less than the area $A_{n+1} = \int_{n_0}^{n+1} f(x)\,dx$, one has that the sequence
$\{s_{n+1} - a_{n_0}\}$ is bounded. Hence the sequence $\{s_n\}$ is also bounded
(why?). Now, Theorem 20 tells us that the series $\sum_{n=n_0}^{\infty} a_n$ is convergent.


Why can we say that if $\lim_{x \to \infty} f(x) \neq 0$, then the above series is divergent?


Fig. 2.1


The integral test is very useful in practice. Assume that somebody
is interested in the nature of the series $\sum_{n=2}^{\infty} \frac{1}{n \ln n}$. Let us apply the
integral test and consider the associated decreasing continuous function
$f : [2, \infty) \to [0, \infty)$, $f(x) = \frac{1}{x \ln x}$
(we simply put $x$ instead of $n$ in $a_n = \frac{1}{n \ln n}$ for $n \geq 2$). Since

$A_n = \int_{2}^{n} \frac{dx}{x \ln x} = \ln(\ln x)\Big|_{2}^{n} = \ln(\ln n) - \ln(\ln 2) \to \infty,$

$A_n$ is unbounded, thus our series is divergent (see Theorem 21).
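The two inequalities used in the proof of Theorem 21 give, for this example, $A_n \leq s_n \leq A_n + a_2$. The sketch below checks them numerically; it is an illustration, and the names `partial_sum` and `integral_A` are hypothetical.

```python
import math

def partial_sum(n):
    """s_n = sum of 1 / (k ln k) for k = 2, ..., n."""
    return sum(1.0 / (k * math.log(k)) for k in range(2, n + 1))

def integral_A(n):
    """A_n = ln(ln n) - ln(ln 2), the integral of 1/(x ln x) over [2, n]."""
    return math.log(math.log(n)) - math.log(math.log(2))

a_2 = 1.0 / (2 * math.log(2))

for n in (10, 100, 1000, 10000):
    print(n, integral_A(n), partial_sum(n), integral_A(n) + a_2)
```

The partial sums stay within the fixed distance $a_2$ of $\ln(\ln n) - \ln(\ln 2)$, so they are unbounded, although they grow extremely slowly.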

In the last 150 years, one of the most interesting and most intensively
studied functions in Mathematics has been the Zeta function of Riemann.
"Zeta" comes from the Greek letter $\zeta$. This notation
was first used by the great German mathematician B. Riemann. Its
analytic expression is:

(1.2) $\zeta(\alpha) = \sum_{n=1}^{\infty} \frac{1}{n^\alpha}, \quad \alpha \in \mathbb{R}.$

This famous function is usually defined by a series. Thus, the maximal
domain of definition for this function is exactly the set of all $\alpha \in \mathbb{R}$
with the property that the numerical series $\sum_{n=1}^{\infty} \frac{1}{n^\alpha}$ is convergent. We
call this last set the set of convergence of our series. In the following,
using the integral test, we find the convergence set for the Riemann
(zeta) series $\sum_{n=1}^{\infty} \frac{1}{n^\alpha}$.

THEOREM 22. (Riemann zeta series) The Riemann zeta series is
convergent if and only if $\alpha > 1$. This means that the real definition
domain of the function $\zeta$ is the interval $(1, \infty)$.


PROOF. If $\alpha \leq 0$, the general term $\frac{1}{n^\alpha}$ does not tend to zero, so the
series is divergent (Corollary 2). For $\alpha > 0$, let us take in Theorem 21
$f(x) = \frac{1}{x^\alpha}$ for $x \geq 1$. Since

$A_n = \int_{1}^{n} \frac{dx}{x^\alpha} = \frac{1}{1 - \alpha}\left[n^{1-\alpha} - 1\right]$, if $\alpha \neq 1$,

and $A_n = \ln n$, if $\alpha = 1$, then $A_n$ is bounded if and only if $\alpha > 1$ (why?).
Now, Theorem 21 says that the Riemann series $\sum_{n=1}^{\infty} \frac{1}{n^\alpha}$ is convergent if
and only if $\alpha > 1$.
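The dichotomy in Theorem 22 can be seen by comparing partial sums with the integrals $A_n$ from the proof. The following sketch is illustrative only (the names `zeta_partial_sum` and `integral_A` are hypothetical); it uses the bounds $A_n \leq s_n \leq 1 + A_n$, which follow from the integral test argument with $n_0 = 1$ and $a_1 = 1$.

```python
import math

def zeta_partial_sum(alpha, n):
    """s_n = 1 + 1/2^alpha + ... + 1/n^alpha."""
    return sum(1.0 / k ** alpha for k in range(1, n + 1))

def integral_A(alpha, n):
    """A_n = integral of x^(-alpha) over [1, n]."""
    if alpha == 1:
        return math.log(n)
    return (n ** (1.0 - alpha) - 1.0) / (1.0 - alpha)

for alpha in (0.5, 1, 2):
    for n in (10, 1000, 100000):
        print(alpha, n, integral_A(alpha, n), zeta_partial_sum(alpha, n))
```

For $\alpha = 2$ the integrals stay below 1, so the partial sums stay below 2; for $\alpha = 1$ and $\alpha = 0.5$ both columns grow without bound.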


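The sum

$1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n} + \ldots = \zeta(1) = \infty,$

because the series $\sum_{n=1}^{\infty} \frac{1}{n^\alpha}$ is divergent for $\alpha = 1$, thus the sequence of
partial sums

$\left\{s_n = 1 + \frac{1}{2} + \ldots + \frac{1}{n}\right\}$

is strictly increasing and unbounded. Hence $s = \lim s_n = \infty$.
Theorem 22 says that the series

$\zeta(2) = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \ldots$

is convergent. So it can be approximated by

$1 + \frac{1}{2^2} + \frac{1}{3^2} + \ldots + \frac{1}{N^2}$

for $N$ large enough. We call the series $\sum_{n=1}^{\infty} \frac{1}{n}$ the harmonic series. It is
very important in Analysis. Sometimes the following test is useful.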
THEOREM 23. (Cauchy's compression test) Let $\{a_n\}$ be a decreasing
sequence of nonnegative real numbers. Then the series $\sum_{n=0}^{\infty} a_n$
and $\sum_{n=0}^{\infty} 2^n a_{2^n}$ have one and the same nature, i.e. they are simultaneously
convergent or divergent.

PROOF. Let $s_k = \sum_{n=0}^{k} a_n$ and $S_m = \sum_{n=0}^{m} 2^n a_{2^n}$ be the $k$-th and the
$m$-th partial sums of the first and of the second series respectively.
Let us fix $k$ and let us take an $m$ such that $k \leq 2^m - 1$. Then,

$s_k = a_0 + a_1 + \ldots + a_k \leq a_0 + a_1 + \ldots + a_{2^m - 1} = a_0 + a_1 + (a_2 + a_3) +$
$+ (a_4 + a_5 + a_6 + a_7) + \ldots + (a_{2^{m-1}} + a_{2^{m-1}+1} + a_{2^{m-1}+2} + \ldots + a_{2^m - 1}) \leq$
$\leq a_0 + a_1 + 2a_2 + 2^2 a_{2^2} + \ldots + 2^{m-1} a_{2^{m-1}} = a_0 + S_{m-1}.$

So

(1.3) $s_k \leq a_0 + S_{m-1}.$

Now, if the series $\sum_{n=0}^{\infty} 2^n a_{2^n}$ is convergent, then the increasing sequence
$\{S_m\}$ is bounded. The inequality (1.3) says that the sequence $\{s_k\}$ is
also bounded, thus the series $\sum_{n=0}^{\infty} a_n$ is convergent (see Theorem 20). If
$\sum_{n=0}^{\infty} a_n$ is divergent, then the sequence $\{s_k\}$ is unbounded. From (1.3)
we see that the sequence $\{S_m\}$ is also unbounded, so the series
$\sum_{n=0}^{\infty} 2^n a_{2^n}$ is divergent.

Assume now that $m$ is fixed and let us take $k$ such that $k \geq 2^m$.
Then

$s_k = a_0 + a_1 + \ldots + a_k \geq a_0 + a_1 + \ldots + a_{2^m} =$
$= a_0 + a_1 + a_2 + (a_3 + a_4) + \ldots + (a_{2^{m-1}+1} + a_{2^{m-1}+2} + \ldots + a_{2^m}) \geq$
$\geq a_0 + \frac{1}{2} a_1 + a_2 + 2a_4 + \ldots + 2^{m-1} a_{2^m} \geq$
$\geq \frac{1}{2}\left(a_1 + 2a_2 + 2^2 a_{2^2} + \ldots + 2^m a_{2^m}\right) = \frac{1}{2} S_m,$

thus,

(1.4) $s_k \geq \frac{1}{2} S_m.$

If the series $\sum_{n=0}^{\infty} a_n$ is convergent, then the sequence $\{s_k\}$ is bounded
and, using (1.4), we get that the sequence $\{S_m\}$ is also bounded (why?).
Hence, the series $\sum_{n=0}^{\infty} 2^n a_{2^n}$ is convergent (why?). If $\sum_{n=0}^{\infty} 2^n a_{2^n}$ is divergent,
then the sequence $\{S_m\}$ tends to $\infty$ (why?) so, from (1.4), we
get that the sequence $\{s_k\}$ also goes to $\infty$ and thus, the series $\sum_{n=0}^{\infty} a_n$
is also divergent. Now the theorem is completely proved.
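The two inequalities (1.3) and (1.4) at the heart of this proof can be tested on a concrete decreasing sequence. The sketch below is an illustration under an assumption: it uses $a_n = \frac{1}{(n+1)^\alpha}$ (so that $a_0$ is defined), and the names `a`, `s` and `S` are hypothetical helpers.

```python
def a(n, alpha):
    """Assumed decreasing nonnegative sequence a_n = 1 / (n + 1)^alpha."""
    return 1.0 / (n + 1) ** alpha

def s(k, alpha):
    """s_k = a_0 + a_1 + ... + a_k."""
    return sum(a(n, alpha) for n in range(k + 1))

def S(m, alpha):
    """S_m = sum of 2^n * a_{2^n} for n = 0, ..., m."""
    return sum(2 ** n * a(2 ** n, alpha) for n in range(m + 1))

TOL = 1e-12  # slack for floating-point rounding

for alpha in (1.0, 2.0):
    for m in range(1, 11):
        upper_ok = s(2 ** m - 1, alpha) <= a(0, alpha) + S(m - 1, alpha) + TOL  # (1.3)
        lower_ok = s(2 ** m, alpha) + TOL >= 0.5 * S(m, alpha)                  # (1.4)
        print(alpha, m, upper_ok, lower_ok)
```

For $\alpha = 2$ both $s_k$ and $S_m$ stay bounded, for $\alpha = 1$ both grow without bound, in agreement with the theorem.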


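We can use this test to find again the result on the Riemann zeta
function $\zeta(\alpha) = \sum_{n=1}^{\infty} \frac{1}{n^\alpha}$ (see Theorem 22). Indeed, here $a_n = \frac{1}{n^\alpha}$ and
$a_{2^n} = \frac{1}{2^{n\alpha}} = \left(\frac{1}{2^\alpha}\right)^n$. The series

$\sum_{n=0}^{\infty} 2^n \left(\frac{1}{2^\alpha}\right)^n = \sum_{n=0}^{\infty} \left(\frac{1}{2^{\alpha - 1}}\right)^n$

is obviously convergent if and only if $\alpha > 1$ (see Theorem 17). Thus,
from the Cauchy compression test, we get that the Riemann series is
convergent if and only if $\alpha > 1$.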
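Now, let us find all the values of $\alpha \in \mathbb{R}$ such that the series
$\sum_{n=2}^{\infty} \frac{1}{n (\log_a n)^\alpha}$ is convergent (the base $a > 1$ is fixed). If in
$\frac{1}{n (\log_a n)^\alpha}$ we put $2^n$ instead of $n$ and
if we multiply the result by $2^n$, we get the series

$\sum_{n=1}^{\infty} 2^n \frac{1}{2^n (\log_a 2^n)^\alpha} = \frac{1}{(\log_a 2)^\alpha} \sum_{n=1}^{\infty} \frac{1}{n^\alpha}.$

Thus, the nature of our series is the same as the nature of the Riemann
series. Therefore, our series is convergent if and only if $\alpha > 1$.
Another useful convergence test is the following: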
Another useful convergence test is the following: 


THEOREM 24. (The comparison test) Let $\sum_{n=0}^{\infty} a_n$ and $\sum_{n=0}^{\infty} b_n$ be two
series with $a_n \geq 0$, $b_n \geq 0$ and $a_n \leq b_n$ for $n = 0, 1, 2, \ldots$. a) If the
series $\sum_{n=0}^{\infty} b_n$ is convergent, then the series $\sum_{n=0}^{\infty} a_n$ is also convergent. b)
If the series $\sum_{n=0}^{\infty} a_n$ is divergent, then the series $\sum_{n=0}^{\infty} b_n$ is also divergent.


PROOF. Since $a_n \leq b_n$ for $n = 0, 1, 2, \ldots$, then
$s_n = a_0 + a_1 + \ldots + a_n \leq b_0 + b_1 + \ldots + b_n = u_n,$
where $u_n$ is the partial $n$-th sum of the series $\sum_{n=0}^{\infty} b_n$. a) If the series $\sum_{n=0}^{\infty} b_n$ is
convergent, the sequence $\{u_n\}$ is bounded. Hence the sequence $\{s_n\}$ is also
bounded, and so the series $\sum_{n=0}^{\infty} a_n$ is convergent (see Theorem 20). b)
If the series $\sum_{n=0}^{\infty} a_n$ is divergent, then the sequence $\{s_n\}$ is unbounded
(see Theorem 20). Hence the sequence $\{u_n\}$ is unbounded (why?), so
the series $\sum_{n=0}^{\infty} b_n$ is divergent.


For instance, the series $\sum_{n=1}^{\infty} \frac{1}{n^2 + 7}$ is convergent because
$\frac{1}{n^2 + 7} < \frac{1}{n^2}$ for any $n \geq 1$
and because the series $\sum_{n=1}^{\infty} \frac{1}{n^2} = \zeta(2)$ is convergent (see Theorem 22).


The comparison test is also useful in proving the following basic
convergence test (see Theorem 25).

First of all we remark that the natural way to add two series is the
following:

(1.5) $\sum_{n=0}^{\infty} a_n + \sum_{n=0}^{\infty} b_n \overset{\text{def}}{=} \sum_{n=0}^{\infty} (a_n + b_n).$

It is easy to see that if both series are convergent, then the
resulting series on the right is also convergent (prove it!). If $a_n, b_n$ are
nonnegative then, if at least one series is divergent, the series on the
right in (1.5) is also divergent (prove it!). In general this is not true.
For instance, $\sum_{n=0}^{\infty} n + \sum_{n=0}^{\infty} (-n) = 0$!
Now, if $\lambda$ is a real number, by definition,

$\lambda \sum_{n=0}^{\infty} a_n \overset{\text{def}}{=} \sum_{n=0}^{\infty} \lambda a_n.$

If $\lambda = -1$, we can define the subtraction:

$\sum_{n=0}^{\infty} a_n - \sum_{n=0}^{\infty} b_n \overset{\text{def}}{=} \sum_{n=0}^{\infty} (a_n - b_n).$

For $\lambda \neq 0$, the series $\sum_{n=0}^{\infty} a_n$ and $\lambda \sum_{n=0}^{\infty} a_n$ have the same nature (prove
it!). Pay attention to the following wrong calculation:

$\sum_{n=1}^{\infty} \frac{1}{n} - \sum_{n=1}^{\infty} \frac{1}{n+1} = \sum_{n=1}^{\infty} \frac{1}{n(n+1)}.$

The series on the right side is convergent, but on the left side we have
$\infty - \infty$, an undetermined operation, so it cannot be equal to a
determined one!


THEOREM 25. (The limit comparison test) Let $\sum_{n=0}^{\infty} a_n$ and $\sum_{n=0}^{\infty} b_n$
be two numerical series of real numbers such that $a_n \geq 0$ and $b_n > 0$
for any $n = 0, 1, 2, \ldots$. Suppose that the sequence $\left\{\frac{a_n}{b_n}\right\}$ is convergent
to $l \in \mathbb{R} \cup \{\infty\}$. Then, a) if $l \neq 0, \infty$, both series have the same
nature (they are convergent or divergent at the same time), b) if $l = 0$, $\sum_{n=0}^{\infty} b_n$
convergent implies $\sum_{n=0}^{\infty} a_n$ convergent and, c) if $l = \infty$, $\sum_{n=0}^{\infty} b_n$ divergent
implies $\sum_{n=0}^{\infty} a_n$ divergent. This is why the series $\sum_{n=0}^{\infty} b_n$ is called a witness
series.


PROOF. a) Since $l \neq 0, \infty$, $l > 0$, so there is an $\varepsilon > 0$ such that
$l - \varepsilon > 0$. Since $\lim_{n \to \infty} \frac{a_n}{b_n} = l$, there is a natural number $N$ (depending
on $\varepsilon$) with $l - \varepsilon < \frac{a_n}{b_n} < l + \varepsilon$ for any $n \geq N$. Because of the last double
inequality and since $b_n > 0$, one can write

(1.6) $(l - \varepsilon) b_n < a_n < (l + \varepsilon) b_n,$

for any $n \geq N$. Now, if for instance $\sum_{n=0}^{\infty} a_n$ is convergent (this means
that the series $\sum_{n=N}^{\infty} a_n$ is also convergent, by Theorem 19) then, using
the inequality $(l - \varepsilon) b_n < a_n$ and the comparison test (Theorem 24),
we get that the series $(l - \varepsilon) \sum_{n=N}^{\infty} b_n$ is convergent. Since $l - \varepsilon \neq 0$
we finally obtain that the series $\sum_{n=N}^{\infty} b_n$ is convergent, i.e. the series
$\sum_{n=0}^{\infty} b_n$ is convergent (see the renouncement test). Conversely, if this last series is
convergent, using the second inequality, $a_n < (l + \varepsilon) b_n$, from (1.6), one
gets that the first series $\sum_{n=0}^{\infty} a_n$ is convergent (complete the reasoning!).

b) If $l = 0$, take an $\varepsilon > 0$ and take a natural number $N_1$ (depending
on $\varepsilon$) such that for any $n \geq N_1$ we have $0 \leq \frac{a_n}{b_n} < \varepsilon$, or $a_n < \varepsilon b_n$. If the
series $\sum_{n=0}^{\infty} b_n$ is convergent, then the series $\sum_{n=N_1}^{\infty} b_n$ is also convergent, so
the series $\sum_{n=N_1}^{\infty} a_n$ is convergent (see the comparison test). Using again
the renouncement test we get that the series $\sum_{n=0}^{\infty} a_n$ is convergent.

c) If $l = \infty$, take a positive real number $M > 0$ and take a natural
number $N_2$ (depending on $M$) such that for $n \geq N_2$, $\frac{a_n}{b_n} > M$, or
$a_n > M b_n$. Now, if the series $\sum_{n=0}^{\infty} b_n$ is divergent, then the series $\sum_{n=N_2}^{\infty} b_n$
is also divergent (see Theorem 19). Use the inequality $a_n > M b_n$ to
obtain that the series $\sum_{n=N_2}^{\infty} a_n$ is divergent (see the comparison test).
Using again the renouncement test we get that the series $\sum_{n=0}^{\infty} a_n$ is
divergent.


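Let us decide if the series $\sum_{n=0}^{\infty} \frac{\sqrt{n}}{n^2 + 4}$ is convergent or not. We intend
to use the limit comparison test with $a_n = \frac{\sqrt{n}}{n^2 + 4}$ and $b_n = \frac{1}{n^\alpha}$. We try
to find an $\alpha$ such that the limit $l = \lim \frac{a_n}{b_n}$ is finite and nonzero. If we
can do this, such an $\alpha$ is unique. Its value is called the "Abel degree"
of the function $f(x) = \frac{\sqrt{x}}{x^2 + 4}$. So,

$l = \lim_{n \to \infty} \frac{a_n}{b_n} = \lim_{n \to \infty} \frac{n^{\alpha + \frac{1}{2}}}{n^2 \left(1 + \frac{4}{n^2}\right)} \neq 0, \infty$

($l = 1$) if and only if $\alpha + \frac{1}{2} = 2$, i.e. $\alpha = \frac{3}{2} > 1$. Since the series $\sum_{n=1}^{\infty} \frac{1}{n^{3/2}} = \zeta\left(\frac{3}{2}\right)$
is convergent (see the Riemann zeta series), from the limit comparison
test one has that the series $\sum_{n=1}^{\infty} \frac{\sqrt{n}}{n^2 + 4}$ is convergent. Applying again the
renouncement test we get that our initial series $\sum_{n=0}^{\infty} \frac{\sqrt{n}}{n^2 + 4}$ is convergent.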
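The choice $\alpha = \frac{3}{2}$ in this example can be sanity-checked numerically. The sketch below is an illustration with hypothetical names `a`, `scaled` and `partial_sum`; it evaluates $n^{3/2} a_n$ for growing $n$ and a long partial sum of the series.

```python
import math

def a(n):
    """General term a_n = sqrt(n) / (n^2 + 4)."""
    return math.sqrt(n) / (n * n + 4)

def scaled(n, alpha=1.5):
    """n^alpha * a_n, which should approach l = 1 for alpha = 3/2."""
    return n ** alpha * a(n)

def partial_sum(n):
    return sum(a(k) for k in range(n + 1))

for n in (10, 1000, 10 ** 6):
    print(n, scaled(n))
print(partial_sum(100000))
```

Since $a_n < \frac{1}{n^{3/2}}$ and $\sum_{n=1}^{N} \frac{1}{n^{3/2}} \leq 1 + \int_1^N x^{-3/2}\,dx < 3$, every partial sum stays below 3, in agreement with the boundedness test.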

Let us put in a systematic manner all the reasonings in this last 
example. 


THEOREM 26. (The $\alpha$-comparison test) Let $\sum_{n=0}^{\infty} a_n$ be a series with
nonnegative terms ($a_n \geq 0$). We assume that there is a real number $\alpha$
such that the following limit does exist: $\lim n^\alpha a_n = l \in \mathbb{R} \cup \{\infty\}$. a) If
$l \neq 0, \infty$ then the series $\sum_{n=0}^{\infty} a_n$ is convergent if and only if $\alpha > 1$. b)
If $l = 0$ and $\alpha > 1$, then our series $\sum_{n=0}^{\infty} a_n$ is convergent. c) If $l = \infty$
and $\alpha \leq 1$, then the series $\sum_{n=0}^{\infty} a_n$ is divergent and equal to $\infty$.

PROOF. It is enough to take $b_n = \frac{1}{n^\alpha}$ in Theorem 25 (do everything
slowly, step by step!).
Let us apply this last test to the following situation. For a large $N$
(> 100, for instance), can we use the approximation


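$\sum_{n=0}^{\infty} \frac{\sqrt[3]{n^2 + n + 1}}{\sqrt{n^4 + 2n + 2}} \approx \sum_{n=0}^{N} \frac{\sqrt[3]{n^2 + n + 1}}{\sqrt{n^4 + 2n + 2}}$ ?

We can do this if and only if our series is convergent (why?). In order
to see if our series is convergent or not, let us consider the limit:

$\lim_{n \to \infty} n^\alpha \frac{\sqrt[3]{n^2 + n + 1}}{\sqrt{n^4 + 2n + 2}} = \lim_{n \to \infty} \frac{n^{\alpha + \frac{2}{3}} \sqrt[3]{1 + \frac{1}{n} + \frac{1}{n^2}}}{n^2 \sqrt{1 + \frac{2}{n^3} + \frac{2}{n^4}}} = \lim_{n \to \infty} \frac{n^{\alpha + \frac{2}{3}}}{n^2}.$

But this last limit is neither 0 nor $\infty$ if and only if $\alpha + \frac{2}{3} = 2$, or
$\alpha = \frac{4}{3}$ (why?). Since in this case $\alpha > 1$ and the limit $l$ is 1, we apply
the $\alpha$-comparison test (Theorem 26) and find that our initial series is
convergent. Hence the above approximation works!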

A very useful test is the ratio test or D’Alembert test. 


THEOREM 27. (The ratio test) Let $\sum_{n=0}^{\infty} a_n$ be a series with positive
terms.

a) If there is a real number $\lambda$ such that $0 < \lambda < 1$ and $\frac{a_{n+1}}{a_n} \leq \lambda$
for any $n \geq N$, where $N$ is a fixed natural number, then the series is
convergent. This condition is equivalent to saying that $\limsup \frac{a_{n+1}}{a_n} < 1$.

b) If $\frac{a_{n+1}}{a_n} \geq 1$ for any $n \geq M$, where $M$ is a fixed natural number,
then the series is divergent.

c) If $\limsup \frac{a_{n+1}}{a_n} = 1$, and if $\frac{a_{n+1}}{a_n}$ is not equal to 1 from a rank on,
then, in general, we cannot decide if the series is convergent or not (in
this situation use more powerful tests, for instance the "Raabe-Duhamel
Test").

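PROOF. a) Let us put $n = N, N+1, N+2, \ldots$ in the inequality
$\frac{a_{n+1}}{a_n} \leq \lambda$. We find:

$a_{N+1} \leq \lambda a_N, \quad a_{N+2} \leq \lambda a_{N+1} \leq \lambda^2 a_N, \quad \ldots, \quad a_{N+m} \leq \lambda^m a_N, \ldots.$

Hence,

$a_N + a_{N+1} + \ldots + a_{N+m} \leq a_N (1 + \lambda + \lambda^2 + \ldots + \lambda^m) < \frac{a_N}{1 - \lambda}.$

So any partial sum of the series $\sum_{n=N}^{\infty} a_n$ is bounded. Since $a_n > 0$,
the series $\sum_{n=N}^{\infty} a_n$ is convergent (Theorem 20). The renouncement test
says that the whole series $\sum_{n=0}^{\infty} a_n$ is also convergent.

b) If $\frac{a_{n+1}}{a_n} \geq 1$ for any $n \geq M$, then

$0 < a_M \leq a_{M+1} \leq a_{M+2} \leq \ldots \leq a_n \leq \ldots$, hence $a_M + a_{M+1} + \ldots + a_n + \ldots = \infty,$

so the series $\sum_{n=0}^{\infty} a_n$ is divergent (explain everything slowly, step by
step!).

c) For instance, the harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$ is divergent, but

$\limsup \frac{a_{n+1}}{a_n} = \lim_{n \to \infty} \frac{n}{n+1} = 1.$

This last property is also true for the series $\sum_{n=1}^{\infty} \frac{1}{n^2}$, but this last series
is convergent! This is why we cannot say anything in general if one can
find numbers of the form $\frac{a_{n+1}}{a_n} < 1$ as close as we want to 1.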


REMARK 5. The condition from a) of Theorem 27 is equivalent to
saying that $\limsup \frac{a_{n+1}}{a_n} < 1$ (why?). If the sequence $\left\{\frac{a_{n+1}}{a_n}\right\}$ is convergent
to $l$, then Theorem 27 can be stated more precisely. Namely, in this last
case, the series $\sum_{n=0}^{\infty} a_n$ is convergent if $l < 1$, it is divergent if $l > 1$ and
if $l = 1$ we cannot say anything (prove it!).


For instance, the series ore = is convergent because lim “++ = 0 < 


n=0 M200! SSE 
1 (see Remark 5). 
Usually, if $\lim \frac{a_{n+1}}{a_n} = 1$, we try to apply the following "more powerful"
test.
THEOREM 28. (The Raabe-Duhamel test) Let $\sum_{n=0}^{\infty} a_n$ be a series with
positive terms.

a) If there is a real number $\lambda \in (1, \infty)$ and a natural number $N$ such
that $n\left(\frac{a_n}{a_{n+1}} - 1\right) \geq \lambda$ for any $n \geq N$, then the series is convergent.

b) If $n\left(\frac{a_n}{a_{n+1}} - 1\right) \leq 1$ for $n \geq M$, where $M$ is a fixed natural
number, then the series is divergent.

c) Assume that the following limit exists, $\lim_{n \to \infty} n\left(\frac{a_n}{a_{n+1}} - 1\right) = l \in
\mathbb{R} \cup \{\infty\}$. Then, if $l > 1$, the series is convergent, if $l < 1$, the series is
divergent and if $l = 1$, we cannot decide on the nature of this series.


One can find a proof of this result in [Nik], or in [Pal]. See also 
Problem 11 of this chapter. 
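Let us find the nature of the series

$$\sum_{n=1}^{\infty} \frac{1 \cdot 3 \cdot 5 \cdots (2n+1)}{2 \cdot 4 \cdot 6 \cdots 2n} \cdot \frac{1}{2n+3}.$$

Since

$$\frac{a_{n+1}}{a_n} = \frac{(2n+3)^2}{(2n+2)(2n+5)} \to 1,$$

let us apply the Raabe-Duhamel test. Since

$$n \left( \frac{a_n}{a_{n+1}} - 1 \right) = \frac{2n^2 + n}{(2n+3)^2} \to \frac{1}{2} < 1,$$

the series is divergent.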
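
The two limits used in this example can be checked with exact rational arithmetic. The sketch below assumes that the general term of the example is $a_n = \frac{1 \cdot 3 \cdots (2n+1)}{2 \cdot 4 \cdots 2n} \cdot \frac{1}{2n+3}$, so that $\frac{a_{n+1}}{a_n} = \frac{(2n+3)^2}{(2n+2)(2n+5)}$; the function names `ratio` and `raabe` are illustrative only.

```python
from fractions import Fraction


def ratio(n):
    """a_{n+1}/a_n for the assumed general term a_n (see the lead-in)."""
    return Fraction((2 * n + 3) ** 2, (2 * n + 2) * (2 * n + 5))


def raabe(n):
    """The Raabe-Duhamel quantity n * (a_n / a_{n+1} - 1)."""
    return n * (1 / ratio(n) - 1)


for n in (10, 100, 1000, 10000):
    # The ratio tends to 1 (ratio test undecided); the Raabe quantity tends to 1/2.
    print(n, float(ratio(n)), float(raabe(n)))
```

Since the Raabe-Duhamel quantity stays below 1 for every $n$, part b) of Theorem 28 already applies, without passing to the limit.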


THEOREM 29. (The Cauchy root test) Let $\sum_{n=0}^{\infty} a_n$ be a series with nonnegative terms.

a) If there is a real number $\lambda \in (0, 1)$ such that $\sqrt[n]{a_n} \le \lambda$ for $n \ge N$, where $N$ is a fixed natural number, then the series is convergent.

b) If $\sqrt[n]{a_n} \ge 1$ for all $n \ge M$, where $M$ is a fixed natural number, then the series is divergent.

c) Assume that the following limit exists, $\lim_{n\to\infty} \sqrt[n]{a_n} = l \in \mathbb{R} \cup \{\infty\}$. Then, if $l < 1$, the series is convergent, if $l > 1$, the series is divergent and if $l = 1$, we cannot decide on the nature of this series.


PROOF. a) The condition $\sqrt[n]{a_n} \le \lambda$ for $n \ge N$ implies $a_n \le \lambda^n$ for $n \ge N$, so

$$a_N + a_{N+1} + \dots + a_{N+p} \le \lambda^N + \lambda^{N+1} + \dots + \lambda^{N+p} = \lambda^N \cdot \frac{1 - \lambda^{p+1}}{1 - \lambda} < \frac{\lambda^N}{1 - \lambda},$$

so the partial sums of the series $\sum_{n=N}^{\infty} a_n$ are bounded. Hence the series $\sum_{n=N}^{\infty} a_n$ is convergent (see Theorem 20). From the renouncement test we derive that the series $\sum_{n=0}^{\infty} a_n$ is convergent.

b) The condition $\sqrt[n]{a_n} \ge 1$ for $n \ge M$ implies $a_n \ge 1$ for an infinite number of terms, so $\{a_n\}$ does not tend to zero. Hence the series is divergent (see Corollary 2).

c) Suppose first that $l < 1$. Take $\varepsilon > 0$ such that $l + \varepsilon < 1$. Since $\sqrt[n]{a_n} \to l$, there is a natural number $N$ such that if $n \ge N$, then $\sqrt[n]{a_n} < l + \varepsilon$. Apply now a) and find that the series is convergent. If $l > 1$, there is a rank $M$ from which on $\sqrt[n]{a_n} > 1$ for $n \ge M$ and so the series is divergent (see b)). If $l = 1$, there are some cases in which the series is convergent and there are other cases in which the series is divergent. For instance, the series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ is convergent and $l = \lim \sqrt[n]{\frac{1}{n^2}} = 1$ (since $\sqrt[n]{n} \to 1$; prove this! Hint:

$$\alpha_n = \sqrt[n]{n} - 1 \implies n = (1 + \alpha_n)^n = 1 + n\alpha_n + \frac{n(n-1)}{2} \alpha_n^2 + \dots > \frac{n(n-1)}{2} \alpha_n^2 \implies \alpha_n^2 < \frac{2}{n-1},$$

so $\alpha_n \to 0$). But the series $\sum_{n=1}^{\infty} \frac{1}{n}$ is divergent and $l = \lim_{n\to\infty} \sqrt[n]{\frac{1}{n}} = 1$.


The series x Ge + yn is convergent because ?/a, = mn < 5 for any 
oe ee eae just applied the Bae Root Test, a)). We can also 
apply the (Comparison Test: Gin aa <3 for any 7 = 1,2:... ete: 


REMARK 6. A natural question arises: what is the connection (if there is one!) between the ratio test and the root test? To explain this we need a powerful result from the calculus of the limits of sequences. This is the famous Cesaro-Stolz Theorem: Let $\{a_n\}$ be an arbitrary sequence and let $\{b_n\}$ be an increasing and unbounded sequence of positive numbers such that the sequence $\left\{ \frac{a_{n+1} - a_n}{b_{n+1} - b_n} \right\}$ is convergent to $l \in \overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$. Then $\frac{a_n}{b_n} \to l$. A direct consequence of this result is the Cesaro Theorem: Let $\{a_n\}$ be a sequence convergent to $l$. Then the "means" sequence $\left\{ \frac{a_1 + a_2 + \dots + a_n}{n} \right\}$ is also convergent to $l$ (prove it as an application of the Cesaro-Stolz Theorem). We prove now that for a sequence $\{a_n\}$ of positive numbers, such that the limit of the sequence $\left\{ \frac{a_{n+1}}{a_n} \right\}$ does exist in $\overline{\mathbb{R}}$, one has $\left\{ \frac{a_{n+1}}{a_n} \right\} \to l$ if and only if $\{ \sqrt[n]{a_n} \} \to l$. Suppose that $\left\{ \frac{a_{n+1}}{a_n} \right\} \to l$; then $\ln a_{n+1} - \ln a_n \to \ln l$, or $\frac{\ln a_{n+1} - \ln a_n}{(n+1) - n} \to \ln l$. From the Cesaro-Stolz Theorem we get that $\frac{\ln a_n}{n} = \ln \sqrt[n]{a_n} \to \ln l$, or $\sqrt[n]{a_n} \to l$. Conversely, assume that $\{ \sqrt[n]{a_n} \} \to l$ and that $\left\{ \frac{a_{n+1}}{a_n} \right\} \to l'$. From the first implication, one has that $l = l'$ and the statement is completely proved.
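
A numerical illustration of this remark, as a sketch only: the sequence $a_n = \frac{n!}{n^n}$ is chosen here as an example (it does not come from the text). For it $\frac{a_{n+1}}{a_n} = \left( \frac{n}{n+1} \right)^n \to \frac{1}{e}$, so the remark predicts $\sqrt[n]{a_n} \to \frac{1}{e}$ as well. Logarithms are used to avoid overflow.

```python
import math


def log_a(n):
    """ln(a_n) for the illustrative sequence a_n = n! / n^n."""
    return math.lgamma(n + 1) - n * math.log(n)


def ratio(n):
    """a_{n+1} / a_n = (n / (n + 1))^n."""
    return (n / (n + 1)) ** n


def root(n):
    """The n-th root of a_n, computed as exp(ln(a_n) / n)."""
    return math.exp(log_a(n) / n)


for n in (10, 1000, 100000):
    print(n, ratio(n), root(n), 1 / math.e)
```

The root sequence converges noticeably more slowly than the ratio sequence, which is typical: the root test is not weaker than the ratio test, but the ratio is usually easier to compute.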


Suppose we have a series $\sum_{n=0}^{\infty} a_n$ with $a_n > 0$ for any $n \ge N$, such that $\left\{ \frac{a_{n+1}}{a_n} \right\} \to 1$. We cannot decide on the nature of this series. Remark 6 says that it is not a good idea to try to apply the Cauchy Root Test, because this one also cannot decide if the series is convergent or not.


2. Series with arbitrary terms 


Up to now we just considered (in principle) series with nonnegative terms. If the number of positive terms or the number of negative terms in a series is finite, to decide the nature of this series it is sufficient to remove those terms and thus to obtain a new series with all its terms positive or all its terms negative (see the renouncement test). If $a_n \le 0$ in a series $\sum_{n=0}^{\infty} a_n$, we consider the new series $\sum_{n=0}^{\infty} (-a_n) = -\sum_{n=0}^{\infty} a_n$ and apply the results obtained in the previous section. For instance, $\sum_{n=1}^{\infty} \frac{-1}{n^3} = -\sum_{n=1}^{\infty} \frac{1}{n^3}$ is convergent, because $\sum_{n=1}^{\infty} \frac{1}{n^3}$ is convergent (it is the value of the Riemann series for $\alpha = 3 > 1$). A numerical series $\sum_{n=0}^{\infty} a_n$ is said to have arbitrary terms if the sign of its terms $a_n$ may be positive, negative or zero, but not all of them (nor all except a finite number of them) are of the same sign. We also call such a series a general series. The Cauchy general test (see Theorem 18) and the zero test are the only tests we know (up to now) on general series. Here is another important one.


THEOREM 30. (The Abel-Dirichlet test) Let $\{a_n\}$ be a decreasing to zero ($a_n \downarrow 0$) sequence of nonnegative ($a_n \ge 0$) real numbers. Let $\sum_{n=0}^{\infty} b_n$ be a series with bounded partial sums (i.e. there is a real number $M > 0$ such that for $s_n = b_0 + b_1 + \dots + b_n$, one has $|s_n| \le M$, where $n = 0, 1, \dots$). Then the series $\sum_{n=0}^{\infty} a_n b_n$ is convergent.


PROOF. We intend to apply the Cauchy general test (Theorem 18). Let us denote by $S_n = a_0 b_0 + a_1 b_1 + \dots + a_n b_n$ the n-th partial sum of the series $\sum_{n=0}^{\infty} a_n b_n$ and let us evaluate

$$|S_{n+p} - S_n| = |a_{n+1} b_{n+1} + \dots + a_{n+p} b_{n+p}| =$$

$$= |a_{n+1}(s_{n+1} - s_n) + a_{n+2}(s_{n+2} - s_{n+1}) + \dots + a_{n+p}(s_{n+p} - s_{n+p-1})| =$$

$$= |-a_{n+1} s_n + (a_{n+1} - a_{n+2}) s_{n+1} + \dots + (a_{n+p-1} - a_{n+p}) s_{n+p-1} + a_{n+p} s_{n+p}| \le$$

$$(2.1) \qquad \le a_{n+1} |s_n| + (a_{n+1} - a_{n+2}) |s_{n+1}| + \dots + (a_{n+p-1} - a_{n+p}) |s_{n+p-1}| + a_{n+p} |s_{n+p}|.$$

Let $\varepsilon > 0$ be a small positive real number. In the last row of (2.1) we put instead of $|s_j|$, $j = n, n+1, \dots, n+p$, the greater number $M$. So we get

$$(2.2) \qquad |S_{n+p} - S_n| \le M (a_{n+1} + a_{n+1} - a_{n+2} + a_{n+2} - a_{n+3} + \dots + a_{n+p-1} - a_{n+p} + a_{n+p}) = 2M a_{n+1}.$$

Since $\{a_n\}$ tends to 0 as $n \to \infty$, there is a natural number $N$ (which depends on $\varepsilon$) such that for any $n \ge N$ one has $2M a_{n+1} < \varepsilon$. Since $|S_{n+p} - S_n| \le 2M a_{n+1}$ (see (2.2)), we get that $|S_{n+p} - S_n| < \varepsilon$ for any $n \ge N$ and any $p = 1, 2, \dots$. This means that the sequence $\{S_n\}$ is a Cauchy sequence, i.e. the series $\sum_{n=0}^{\infty} a_n b_n$ is convergent (see Theorem 18) and our theorem is completely proved.
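
The key estimate $|S_{n+p} - S_n| \le 2M a_{n+1}$ can be tested numerically. The sketch below uses $a_n = \frac{1}{n+1}$ and $b_n = \sin n$, which are choices made here for illustration and not taken from the text; the constant `M` is simply the largest $|s_n|$ observed on the computed range, which is all that the estimate needs on that range.

```python
import math
from itertools import accumulate

N_MAX = 2000
a = [1.0 / (n + 1) for n in range(N_MAX + 1)]      # decreasing to zero
b = [math.sin(n) for n in range(N_MAX + 1)]        # bounded partial sums

s = list(accumulate(b))                            # s_n = b_0 + ... + b_n
S = list(accumulate(x * y for x, y in zip(a, b)))  # S_n = a_0 b_0 + ... + a_n b_n
M = max(abs(v) for v in s)                         # bound for |s_n| on this range


def abel_bound_holds(n, p):
    """Check |S_{n+p} - S_n| <= 2 M a_{n+1}, with a tiny rounding allowance."""
    return abs(S[n + p] - S[n]) <= 2 * M * a[n + 1] + 1e-12


print(M, all(abel_bound_holds(n, 50) for n in range(0, 1900)))
```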


The following test is a direct consequence of the Abel-Dirichlet test. 


COROLLARY 3. (The Leibniz test) Let $\{a_n\}$ be a decreasing to zero ($a_n \downarrow 0$) sequence of nonnegative ($a_n \ge 0$) real numbers. Then the series

$$\sum_{n=1}^{\infty} (-1)^{n-1} a_n = a_1 - a_2 + a_3 - \dots$$

is convergent.


For instance, applying this test, we get that the series }> (—1)” 


n24+3 
n=1 


— 3 (-1)""' 345 is convergent (do it!). 
n=1 
A famous example is the standard alternating series

$$(2.3) \qquad 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots + (-1)^{n+1} \frac{1}{n} + \dots$$

This series is a general series (why?) and it is convergent. Indeed, $\left\{ a_n = \frac{1}{n} \right\}$ is a decreasing to zero sequence with nonnegative terms, so we can apply the Leibniz test and find that the series is convergent.


DEFINITION 7. (absolute convergence) A series $\sum_{n=0}^{\infty} a_n$ is said to be absolutely convergent if the series of moduli $\sum_{n=0}^{\infty} |a_n|$ is convergent.


For instance, the series $\sum_{n=1}^{\infty} (-1)^n \frac{1}{n^2}$ is convergent (why?) and absolutely convergent, but the series $\sum_{n=1}^{\infty} (-1)^n \frac{1}{n}$ is convergent (why?) and it is not absolutely convergent, because the harmonic series $\sum_{n=1}^{\infty} \frac{1}{n} = Z(1) = \infty$ (see the Riemann series). A series which is convergent, but not absolutely convergent, is called semiconvergent.
The following result says that the notion of absolute convergence is stronger than the notion of (simple) convergence.


THEOREM 31. Any absolutely convergent series $\sum_{n=0}^{\infty} a_n$ is also (simply) convergent.


PROOF. We use again the Cauchy General Test (see Theorem 18). Let $s_n = a_0 + a_1 + \dots + a_n$ be the n-th partial sum of the initial series $\sum_{n=0}^{\infty} a_n$ and let $S_n = |a_0| + |a_1| + \dots + |a_n|$ be the n-th partial sum of the series $\sum_{n=0}^{\infty} |a_n|$. Let us evaluate

$$(2.4) \qquad |s_{n+p} - s_n| = |a_{n+1} + a_{n+2} + \dots + a_{n+p}| \le |a_{n+1}| + |a_{n+2}| + \dots + |a_{n+p}| = |S_{n+p} - S_n|.$$

Let $\varepsilon > 0$ be a small positive real number and let $N$ be a sufficiently large natural number such that for any $n \ge N$ one has $|S_{n+p} - S_n| < \varepsilon$ for any $p = 1, 2, \dots$ (since $\{S_n\}$ is a Cauchy sequence). From (2.4) we have that $|s_{n+p} - s_n| \le |S_{n+p} - S_n|$, so $|s_{n+p} - s_n| < \varepsilon$ for any $n \ge N$ and for any $p = 1, 2, \dots$. But this means that the sequence $\{s_n\}$ is a Cauchy sequence. Hence the series $\sum_{n=0}^{\infty} a_n$ is convergent (see Theorem 18).




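For instance, the series $\sum_{n=1}^{\infty} \frac{\sin(5n)}{n^2}$ is convergent because it is absolutely convergent. Indeed, since $\left| \frac{\sin(5n)}{n^2} \right| \le \frac{1}{n^2}$ and since the series $\sum_{n=1}^{\infty} \frac{1}{n^2} = Z(2)$ is convergent (see the Riemann series), the Comparison Test says that the series of moduli $\sum_{n=1}^{\infty} \left| \frac{\sin(5n)}{n^2} \right|$ is convergent, i.e. the initial series $\sum_{n=1}^{\infty} \frac{\sin(5n)}{n^2}$ is absolutely convergent, hence convergent.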


REMARK 7. (see [Nik] or [Pal]) We saw above that any absolutely 
convergent series is convergent, but the converse is not true. Cauchy 
proved that in any absolutely convergent series one can change the order 
of the terms in the infinite sum (by any permutation) and the sum of 
the series remains the same. On the contrary, Riemann proved that 


for a semiconvergent series $\sum_{n=0}^{\infty} a_n$, and for any number $A \in \overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$, one can find a permutation of the terms of the series $\sum_{n=0}^{\infty} a_n$ such that its sum becomes exactly $A$. Two absolutely convergent series can be multiplied by the usual polynomial multiplication rule

$$\left( \sum_{n=0}^{\infty} a_n \right) \cdot \left( \sum_{n=0}^{\infty} b_n \right) = \sum_{n=0}^{\infty} c_n, \text{ where } c_n = a_0 b_n + a_1 b_{n-1} + \dots + a_n b_0,$$

and the resulting product series is again absolutely convergent (Mertens).
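
Riemann's statement can be made concrete on the semiconvergent series $1 - \frac{1}{2} + \frac{1}{3} - \dots$. The following sketch uses the standard greedy idea (it is an illustration, not the proof from [Nik] or [Pal]): take unused positive terms $\frac{1}{2k-1}$ while the running sum is at most the target, otherwise take the next unused negative term $-\frac{1}{2k}$. The function name and the parameter `steps` are introduced here for illustration.

```python
def rearranged_partial_sum(target, steps):
    """Partial sum, after `steps` terms, of a greedy rearrangement of
    1 - 1/2 + 1/3 - 1/4 + ... that aims at the real number `target`."""
    total = 0.0
    next_odd, next_even = 1, 2           # denominators of the next unused terms
    for _ in range(steps):
        if total <= target:
            total += 1.0 / next_odd      # next positive term 1/(2k-1)
            next_odd += 2
        else:
            total -= 1.0 / next_even     # next negative term -1/(2k)
            next_even += 2
    return total


for target in (2.0, -1.0, 0.0):
    print(target, rearranged_partial_sum(target, 200000))
```

Every term of the original series is eventually used exactly once, because both the series of positive terms and the series of negative terms are divergent; this is exactly where semiconvergence is needed.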


REMARK 8. If instead of series with real numbers we consider a series with complex numbers $\sum_{n=0}^{\infty} z_n$, where $z_n = x_n + i y_n$, $x_n, y_n \in \mathbb{R}$ for any $n = 0, 1, 2, \dots$, we say that such a series is convergent to its sum $s = u + iv$, $u, v \in \mathbb{R}$, if the sequence of partial sums

$$s_n = z_0 + z_1 + \dots + z_n = (x_0 + x_1 + \dots + x_n) + i(y_0 + y_1 + \dots + y_n)$$

is convergent to $s$, i.e.

$$|s - s_n| = \sqrt{[u - (x_0 + x_1 + \dots + x_n)]^2 + [v - (y_0 + y_1 + \dots + y_n)]^2} \to 0,$$

when $n \to \infty$. This is equivalent to saying that both series with real numbers, $\sum_{n=0}^{\infty} x_n$ (the real part) and $\sum_{n=0}^{\infty} y_n$ (the imaginary part), are convergent to $u$ and $v$ respectively. Hence, $\sum_{n=0}^{\infty} z_n = \sum_{n=0}^{\infty} x_n + i \sum_{n=0}^{\infty} y_n$ and the calculus with complex series reduces to the calculus with real series.
Practically, in general, it is difficult to decide if both the "real part"
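and the "imaginary part" are convergent. For instance, let us consider the series

$$s = \sum_{n=0}^{\infty} \frac{(1+i)^n}{n!} = \sum_{n=0}^{\infty} \frac{\left[ \sqrt{2} \left( \frac{1}{\sqrt{2}} + i \frac{1}{\sqrt{2}} \right) \right]^n}{n!} = \sum_{n=0}^{\infty} \frac{(\sqrt{2})^n \left( \cos \frac{\pi}{4} + i \sin \frac{\pi}{4} \right)^n}{n!}.$$

Let us use now the Moivre formula and find:

$$s = \sum_{n=0}^{\infty} \frac{(\sqrt{2})^n \cos \frac{n\pi}{4}}{n!} + i \sum_{n=0}^{\infty} \frac{(\sqrt{2})^n \sin \frac{n\pi}{4}}{n!}.$$

Since

$$\left| \frac{(\sqrt{2})^n \cos \frac{n\pi}{4}}{n!} \right| \le \frac{(\sqrt{2})^n}{n!}$$

and since

$$\frac{(\sqrt{2})^{n+1}}{(n+1)!} : \frac{(\sqrt{2})^n}{n!} = \frac{\sqrt{2}}{n+1} \to 0,$$

the series $\sum_{n=0}^{\infty} \frac{(\sqrt{2})^n \cos \frac{n\pi}{4}}{n!}$ is absolutely convergent, so it is convergent (state the theorems that we used!). In the same way we prove that the imaginary part series $\sum_{n=0}^{\infty} \frac{(\sqrt{2})^n \sin \frac{n\pi}{4}}{n!}$ is also convergent. An easier way to prove the convergence of the complex series $s = \sum_{n=0}^{\infty} \frac{(1+i)^n}{n!}$ is the following. It is not difficult to prove that an absolutely convergent series $\sum_{n=0}^{\infty} z_n$ (i.e. $\sum_{n=0}^{\infty} |z_n|$ is convergent) is also convergent (see the proof of Theorem 31). In our case,

$$|z_n| = \left| \frac{(1+i)^n}{n!} \right| = \frac{|1+i|^n}{n!} = \frac{(\sqrt{2})^n}{n!}.$$

So, the series $\sum_{n=0}^{\infty} |z_n| = \sum_{n=0}^{\infty} \frac{(\sqrt{2})^n}{n!}$ is convergent (use the ratio test), i.e. the series $s = \sum_{n=0}^{\infty} \frac{(1+i)^n}{n!}$ is absolutely convergent. Hence, it is convergent. If a series $\sum_{n=0}^{\infty} z_n$ is not absolutely convergent, the general way to study it is to write it as:

$$\sum_{n=0}^{\infty} z_n = \sum_{n=0}^{\infty} x_n + i \sum_{n=0}^{\infty} y_n$$

and to study separately the real series $\sum_{n=0}^{\infty} x_n$ and $\sum_{n=0}^{\infty} y_n$. If both of them are convergent, the initial series is also convergent. If at least one of them is divergent, the series $\sum_{n=0}^{\infty} z_n$ is divergent (why?).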
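
As a numerical sketch of this remark, assuming the example series is $\sum_{n=0}^{\infty} \frac{(1+i)^n}{n!}$: its partial sums can be computed directly in complex arithmetic, and its real part can be computed separately from the real series obtained with the Moivre formula. The comparison with `cmath.exp(1 + 1j)` relies on the standard exponential series, a fact that is not used in the text above and serves here only as a check; the function names are illustrative.

```python
import cmath
import math


def partial_sum(z, n_terms):
    """Sum of z^n / n! for n = 0, ..., n_terms - 1, in complex arithmetic."""
    total, term = 0j, 1 + 0j
    for n in range(n_terms):
        total += term
        term *= z / (n + 1)          # z^(n+1)/(n+1)! from z^n/n!
    return total


def real_part_sum(n_terms):
    """Partial sum of the real series with terms sqrt(2)^n cos(n pi/4) / n!."""
    return sum(
        math.sqrt(2) ** n * math.cos(n * math.pi / 4) / math.factorial(n)
        for n in range(n_terms)
    )


s30 = partial_sum(1 + 1j, 30)
print(s30, real_part_sum(30), cmath.exp(1 + 1j))
```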
3. Approximate computations 


Usually, whenever one cannot exactly compute the sum of a convergent series $s = \sum_{n=0}^{\infty} a_n$, one approximates $s$ by its n-th partial sum $s_n = a_0 + a_1 + \dots + a_n$, for sufficiently large $n$. For instance,

$$s = \sum_{n=1}^{\infty} \frac{1}{n^2} \approx s_{1000} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \dots + \frac{1}{1000^2}.$$

The difference $\varepsilon_n = |s - s_n|$ is called the (absolute) error of order $n$ in our process of approximation. It is clear enough why we are interested in the evaluation of this error. Since the series is convergent, $\varepsilon_n \to 0$ when $n \to \infty$. Given a small positive real number $\varepsilon > 0$, the problem is to find an $n$ (as small as possible!) which depends on $\varepsilon$, such that the error $\varepsilon_n < \varepsilon$. For instance, if $\varepsilon = \frac{1}{10^3}$, we say that "$s$ is approximated by $s_n$ with 3 exact decimals".

We study this problem in two cases.

Case 1 Let $s = \sum_{n=0}^{\infty} a_n$ be a series with positive terms ($a_n > 0$, $n = 0, 1, \dots$) and let $\alpha \in (0, 1)$ be such that $\frac{a_{n+1}}{a_n} \le \alpha$ for $n \ge N$ (recall the Ratio Test). The series is convergent (see Theorem 27). Let now $k$ be a natural number greater than or equal to $N$. Let us evaluate the error $\varepsilon_k = s - s_k$:

$$(3.1) \qquad \varepsilon_k = a_{k+1} + a_{k+2} + \dots \le \alpha a_k + \alpha^2 a_k + \dots = \frac{\alpha}{1 - \alpha} a_k.$$

We see that if $\varepsilon > 0$ is an arbitrary small positive real number, one can always find a least $k \in \mathbb{N}$ such that $\frac{\alpha}{1 - \alpha} a_k < \varepsilon$. Since $\varepsilon_k \le \frac{\alpha}{1 - \alpha} a_k$, for this $k$ one also has $\varepsilon_k < \varepsilon$. If we want a small $k$, we must find a small $\alpha \in (0, 1)$ such that for a small $N$ (0 if it is possible), we have $\frac{a_{n+1}}{a_n} \le \alpha$ for $n \ge N$.


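Let us compute the value of $\sum_{n=0}^{\infty} \frac{1}{n!}$ (we shall see later that it is exactly $e$, the base of the Napierian logarithm) with 2 exact decimals. Since $\frac{a_{n+1}}{a_n} = \frac{1}{n+1} \le \frac{1}{2}$ for $n \ge 1$,

$$\varepsilon_k = s - s_k \le \frac{\frac{1}{2}}{1 - \frac{1}{2}} \cdot \frac{1}{k!} = \frac{1}{k!}.$$

Let us find the least $k$ such that $\frac{1}{k!} < \frac{1}{10^2}$. By trials, $k = 1, 2, 3, \dots$, we find $k = 5$. So

$$e \approx 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \frac{1}{5!} = 2.7166\ldots,$$

i.e. we obtained the value of $e$ with 2 exact decimals, $e \approx 2.71$.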
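
The "by trials" search for $k$ is easy to mechanize. The sketch below (function names are illustrative) uses exact fractions to find the least $k$ with $\frac{1}{k!} < \varepsilon$ and then forms the partial sum $s_k$; the comparison with `math.e` is included only as a check of the error bound.

```python
from fractions import Fraction
from math import e, factorial


def least_k(eps):
    """Least k >= 1 with 1/k! < eps (the bound on the error s - s_k)."""
    k = 1
    while Fraction(1, factorial(k)) >= eps:
        k += 1
    return k


def partial_sum(k):
    """s_k = 1/0! + 1/1! + ... + 1/k! as an exact fraction."""
    return sum(Fraction(1, factorial(n)) for n in range(k + 1))


k = least_k(Fraction(1, 100))
s_k = partial_sum(k)
print(k, s_k, float(s_k), e - float(s_k))   # the last value is the actual error
```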

Let $s = \sum_{n=0}^{\infty} a_n$ be a series with nonnegative terms ($a_n \ge 0$, $n = 0, 1, \dots$) and let $\alpha \in (0, 1)$ be such that $\sqrt[n]{a_n} \le \alpha$ for $n \ge N$ (recall the Cauchy Root Test). The series is convergent (see Theorem 29). Let now $k$ be a natural number greater than or equal to $N$. Let us evaluate the error $\varepsilon_k = s - s_k$. Prove that $\varepsilon_k \le \frac{\alpha^{k+1}}{1 - \alpha}$. Use this estimation


to find the value of s = S> 2 with 3 exact decimals. 
n=1 
Case 2 Suppose now that we want to approximate the value of an alternating series, $s = \sum_{n=1}^{\infty} (-1)^{n-1} a_n$, where $\{a_n\}$ is a decreasing sequence with nonnegative terms and $a_n \to 0$. The Leibniz test (see Corollary 3) says that our series is convergent. Since

$$s_{2n} = s_{2n-2} + (a_{2n-1} - a_{2n}) \ge s_{2n-2}$$

and since

$$s_{2n+1} = s_{2n-1} - (a_{2n} - a_{2n+1}) \le s_{2n-1},$$

one has:

$$(3.2) \qquad s_2 \le s_4 \le \dots \le s_{2n} \le \dots \le s \le \dots \le s_{2n+1} \le \dots \le s_3 \le s_1.$$

So,

$$0 \le s - s_{2n} \le s_{2n+1} - s_{2n} = a_{2n+1}$$

and

$$0 \le s_{2n+1} - s \le s_{2n+1} - s_{2n+2} = a_{2n+2}.$$

Hence

$$(3.3) \qquad \varepsilon_n = |s - s_n| \le a_{n+1},$$

i.e. the absolute error is less than or equal to the modulus of the first neglected term. Here, in fact, we have another proof of the Leibniz Test (see Corollary 3). This one is independent of the Abel-Dirichlet Test (Theorem 30). It uses only Cantor Axiom (Axiom 2) (where?).
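
The estimate (3.3) and the bracketing (3.2) can be observed on the standard alternating series $\sum_{n=1}^{\infty} (-1)^{n-1} \frac{1}{n}$. In the sketch below, the exact sum $\ln 2$ is a known fact that is not proved in this section; it is used only to measure the true error. The names are illustrative.

```python
import math


def partial_sums(n_max):
    """Return [s_1, ..., s_{n_max}] for the series sum of (-1)^(n-1) / n."""
    sums, total = [], 0.0
    for n in range(1, n_max + 1):
        total += (-1) ** (n - 1) / n
        sums.append(total)
    return sums


s = math.log(2)               # known value of the sum, used only as a reference
sums = partial_sums(1000)     # sums[n - 1] is s_n

for n in (10, 100, 1000):
    error = abs(s - sums[n - 1])
    # (3.3): the error is at most the first neglected term a_{n+1} = 1/(n+1)
    print(n, error, 1 / (n + 1), error <= 1 / (n + 1))
```

For this series the bound is sharp up to a factor of about 2, so to get 2 exact decimals from (3.3) one needs roughly a hundred terms; alternating series with slowly decreasing terms converge slowly.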

Let us compute s = > (-1l)"" aie with 2 exact decimals. We use 

n=1 
the estimation (3.3) and force with 
1 1 


n+1 = < 
“41 Tint DIP ~ 102 
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for n > 3, so 


4. Problems 


1. Compute the sum of the following series:
an-l43" VSN dae 1 
a) In (ie az) 3 ops prvi 3 yd, n(n+2)? Dy n(n+1)(n+2)? 


00" n n—-1 
e) = SET f) Sy eae 
2. Decide if the following series are convergent or not: 
a) >; by) HS) 1 ey (-1)"4; d) > gerta”, a > 0 


nl? A 1-5-9-....(14+4n) n? m+T 44 


n=0 n= n=0 


(discussion on q@); e) ae (204)" (discussion on a € R); ), a 
8) a see h)S> ae , (discussion on a > 0); >> 4 (20 = 


n=0 n=1 


(De. 
10"n!? 


1)", (discussion on A € R); do (to) sr, @ > 2 (discussion on a); 


k) = Tas (discussion on a); 1) = ae op ee 1)”, (discussion on 


a> 1); m) > Ta Vani’ n) 3 50) 0 Gays PL Sa lea 


n=0 


[o-e) 
wy 3 es s) ata” (discussion on a > 0). 


o Find the Abels degree of the expression E = Seas 
neN. 


4. Use the a-Comparison Test to decide if the series » sin (; 


wn wn) 
is convergent or not. 


5. Find all « € R such that the series ae wa mt x” to be convergent. 


What about all x € C such that the same one is convergent? 
6. Find all z in C such that the following series are absolutely 
convergent. 


a) > 4; b) > EB o) NK nes a) (z — 31+ 2)" 


n=0 n=1 n=0 n=0 


7. Draw the set M = " ER| O(-1)"4 is convergent | on the 


37 


3 
ll 
mn 


real line. 
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n3” 


8. Draw the set U = ‘2 EC| 5 (-1)"& is convergent | in the 


complex plane. 


9. Compute > (—1)" with 2 exact decimals. 


n=1 

10. Compute >> = with one exact decimal. 

n=1 
11. Prove the Raabe-Duhamel test. Hint:
a) Write:

$$N a_N - (N+1) a_{N+1} \ge (\lambda - 1) a_{N+1}$$

$$(N+1) a_{N+1} - (N+2) a_{N+2} \ge (\lambda - 1) a_{N+2}$$

$$\dots$$

$$(N+p) a_{N+p} - (N+p+1) a_{N+p+1} \ge (\lambda - 1) a_{N+p+1}$$

Sum these inequalities on columns and get:

$$N a_N - (N+p+1) a_{N+p+1} \ge (\lambda - 1) [a_{N+1} + a_{N+2} + a_{N+3} + \dots + a_{N+p+1}].$$

So

$$\frac{N a_N}{\lambda - 1} \ge a_{N+1} + a_{N+2} + a_{N+3} + \dots + a_{N+p+1}$$

for any $p = 1, 2, \dots$. Hence, the partial sums of our initial series are bounded. Thus the series is convergent.
b) Since $n a_n \le (n+1) a_{n+1}$ for $n \ge M$, the limit $\lim_{n\to\infty} n a_n$ is greater than 0. So, using the α-comparison test for $\alpha = 1$, we get that our initial series is divergent (why?).

c) Apply a) and b).

12. Compute °~, = with 3 exact decimals (use the approximate 


computation with the Root Test). 


CHAPTER 3 


Sequences and series of functions 


1. Continuous and differentiable functions 


Recall that a metric space is a set X with a distance d on it. A 
distance d on X is a function which associates to any pair (x,y) of X 
a nonnegative real number d(x, y) with the following properties: 

d1. $d(x, y) = 0$ if and only if $x = y$.

d2. $d(x, y) = d(y, x)$ for any $x$ and $y$ in $X$.

d3. $d(x, y) \le d(x, z) + d(z, y)$ for any $x$, $y$ and $z$ in $X$.

See also the Remark 2. We usually denote by $(X, d)$ a metric space $X$ with a distance $d$ on it. The standard example of a metric space is $(\mathbb{R}, d)$, where $d(x, y) = |x - y|$. We say that $x_n \to x$ in $(X, d)$ if the numerical sequence $\{d(x_n, x)\}$ tends to zero, i.e. if the distance between $x_n$ and $x$ becomes smaller and smaller, tending to zero as $n \to \infty$. We define again the basic notion of continuity.


DEFINITION 8. (continuity of a function at a point) Let $(X, d)$, $(X', d')$ be two metric spaces, let $f : X \to X'$ be a function defined on $X$ with values in $X'$ and let $x$ be a fixed element in $X$. We say that $f$ is continuous at $x$ if for any sequence $\{x_n\}$ which converges to $x$, we have that $f(x_n) \to f(x)$. For instance, if $X = X' = \mathbb{R}$, with the usual distance, $f$ is continuous at a point $x$ if the graph of $f$ is not "broken (or interrupted)" at $x$ (see Fig. 3.1). All the elementary functions (polynomials, rational functions, power functions, exponential functions, logarithmic functions, trigonometric functions) and their compositions are continuous on their definition domains, i.e. at any point of their definition domains (see also the Theorem 14). Hence, the continuity is essentially a "local" property, i.e. its definition shows the behavior of the function $f$ at a given point $x$.

Fig. 3.1. Graph of $y = f(x)$: continuous (noninterrupted) at $x_1$, noncontinuous (interrupted) at $x_2$.


For instance, a) $f : \mathbb{R} \to \mathbb{R}$, $f(x) = \frac{x+1}{x^2+1}$ is continuous on the whole $\mathbb{R}$. Indeed, let $a$ be a fixed point in $\mathbb{R}$ and let $\{a_n\}$ be a sequence convergent to $a$. Then, using the basic properties of the convergent sequences relative to the elementary algebraic operations ($+$, $-$, $\cdot$, $:$, see the Theorem 14), we find that

$$f(a_n) = \frac{a_n + 1}{a_n^2 + 1} \to \frac{a + 1}{a^2 + 1} = f(a),$$

i.e. the function $f$ is continuous at $a$, for any $a \in \mathbb{R}$. Hence $f$ is continuous on $\mathbb{R}$. Now, if we compose the function $\ln x$ (which is continuous on $(0, \infty)$) with $f(x)$ we get a new continuous function $g(x) = \ln \frac{x+1}{x^2+1}$ on $(-1, \infty)$ (why?).


REMARK 9. We need in this chapter another basic "local" notion, namely the notion of differentiability of a function $f$ at a given point $a$. Recall that a subset $A$ of $\mathbb{R}$ is said to be open if for any point $a$ of $A$, there is a small positive real number $\varepsilon$, such that the interval $(a - \varepsilon, a + \varepsilon)$ (the "ball" with centre at $a$ and of radius $\varepsilon$, usually called the $\varepsilon$-neighborhood of $a$) is completely included in $A$ (define the notion of an open subset in a metric space $(X, d)$; instead of $\varepsilon$-neighborhoods use open balls $B(a, \varepsilon) = \{x \in X : d(x, a) < \varepsilon\}$, etc.). A subset $B$ of $\mathbb{R}$ is said to be closed if its complement $\mathbb{R} \setminus B$ is an open subset ($B$ is closed in an arbitrary metric space $(X, d)$ if $X \setminus B$ is open in $X$). For instance, $(-\infty, 1)$ is open and $[-3, 7]$ is closed. If $X = (-1, 7)$, with the induced distance of $\mathbb{R}$, then $[0, 7)$ is closed in $X$, but NOT in $\mathbb{R}$ (why?). It is not difficult to prove that a subset $B$ is closed if and only if for any sequence $\{b_n\} \to b$, with all $b_n$ in $B$, one has that $b \in B$ (prove it!). For instance, if $f : X \to \mathbb{R}$ is a continuous function defined on a metric space $(X, d)$ and if $\lambda$ is a real number, then the set $B_\lambda = \{x \in X : f(x) \ge \lambda \text{ (or } \le \lambda \text{, or } = \lambda)\}$ is closed in $X$. Indeed, let $\{b_n\}$ be a sequence of elements in $B_\lambda$ which is convergent to an element $b$ in $X$. Since $f$ is continuous, $f(b_n) \to f(b)$. Because $b_n \in B_\lambda$, $f(b_n) \ge \lambda$ for any $n = 0, 1, \dots$. Then $f(b) \ge \lambda$ (otherwise, $f(b) < \lambda$ and, from a rank $N$ on, $f(b_n) < \lambda$ for $n \ge N$ (why? see the definition of the limit $f(b_n) \to f(b)$!), a contradiction), i.e. $b$ itself is in $B_\lambda$ and so $B_\lambda$ is a closed subset in $X$.


DEFINITION 9. Let $A$ be an open subset of $\mathbb{R}$ (for instance an open interval $(c, d)$), let $f : A \to \mathbb{R}$ be a function defined on $A$ with real values and let $a$ be a fixed point in $A$. We say that $f$ is differentiable at $a$ if the following limit exists (and it is a real number):

$$(1.1) \qquad \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = f'(a).$$

The limit of a function $g : A \to \mathbb{R}$ at a limit point $b$ of $A$ (i.e. $b$ is the limit of at least one sequence of elements from $A$) is a unique number $l \in \overline{\mathbb{R}}$ such that for any nonconstant sequence $\{b_n\}$, $b_n \in A$, which is convergent to $b$, one has that $g(b_n) \to l$. We shortly write $\lim_{x \to b} g(x) = l$.


A function $g$ does not always have a limit at a given limit point $b$. For instance, the function $\mathrm{sign} : \mathbb{R} \to \{-1, 0, 1\}$,

$$(1.2) \qquad \mathrm{sign}(x) = \begin{cases} -1, & \text{if } x < 0 \\ 0, & \text{if } x = 0 \\ 1, & \text{if } x > 0 \end{cases}$$

has the limit $l = -1$ at any point $a < 0$, has the limit $l = 1$ at any point $a > 0$ and at 0 it has no limit at all (prove this!).

We recall that the limit "on the left" of a function $f : A \to \mathbb{R}$, $A \subset \mathbb{R}$, $A$ an open subset, at a point $a$ of $A$ is a number $l_l$ such that for any sequence $\{x_n\}$, $x_n < a$, which is convergent to $a$, one has that $l_l = \lim f(x_n)$. If we take $x_n$ "on the right" of $a$, we get the notion of the limit $l_r$ "on the right" of $f$ at $a$. A function $f$ has the limit $l$ at $a$ if and only if $l_l = l_r = l$ (prove it!).

It is clear enough that a function $f$ which is continuous at a point $a \in A$ has the limit $l = f(a)$ at $a$ (why?). In fact, a function $f : A \to \mathbb{R}$ is continuous at a point $a \in A$ if and only if it has a limit $l$ at $a$ and if that one is exactly $l = f(a)$ (prove it!).
We call the number $f'(a)$ from (1.1) the derivative of $f$ at $a$. The linear function $df(a) : \mathbb{R} \to \mathbb{R}$, $df(a)(x) = f'(a) \cdot x$ is called the (first) differential of $f$ at $a$. This is simply a dilation (or a homothety) of modulus $f'(a)$ of the real line $\mathbb{R}$. If the function $f$ is differentiable at any point $a$ of $A$, we say that $f$ is differentiable (or has a derivative) on $A$. In this last case, the new function $a \mapsto f'(a)$, where $a$ runs over $A$, is called the (first) derivative of $f$. It is denoted by $f'$. We know (see any elementary course in Calculus for the different rules in computing derivatives!) that almost all the elementary functions (described above) and their compositions (recall the chain rule: $(f \circ g)'(a) = f'(g(a)) \cdot g'(a)$) are differentiable on their definition domains. "Almost" because of some exceptions like $f(x) = \sqrt{x}$, $f : [0, \infty) \to \mathbb{R}$. Since $f'(x) = \frac{1}{2\sqrt{x}}$, the derivative of $f$ does not exist at $a = 0$. Indeed, $\lim_{x \to 0, x > 0} \frac{\sqrt{x} - \sqrt{0}}{x - 0} = \infty$! One can interpret the derivative of a function $f$ at a point $a$ either as "the velocity" of $f$ at $a$ or as the slope of the tangent line at $a$ to the graph of $f$ (why?). Not all the functions which are continuous at a given point $a$ are also differentiable at $a$ (see Fig. 3.2). But a function $f$ which is differentiable at a given point $a$ is continuous at $a$. Indeed, let $x_n \to a$. The relation $\lim_{n\to\infty} \frac{f(x_n) - f(a)}{x_n - a} = f'(a)$ (see Definition 9 and what follows) says that only the indeterminate case $\frac{0}{0}$ could give a finite number $f'(a)$: the denominator tends to zero, so the numerator must tend to zero too. Hence, $f(x_n) \to f(a)$, i.e. $f$ is continuous at $a$.

Fig. 3.2. Graph of a function which is differentiable at $x_1$ and continuous but not differentiable at $x_2$.

Let $C$ be a set and let $f : C \to \mathbb{R}$ be a function defined on $C$ with values in $\mathbb{R}$. We say that $f$ is bounded if its image $f(C) = \{f(x) : x \in C\}$ is a bounded subset in $\mathbb{R}$. This means that there is a positive real number $M > 0$ such that $|f(x)| \le M$ (i.e. $-M \le f(x) \le M$) for any $x \in C$. Equivalently, if $C \subset \mathbb{R}$, then $f$ is bounded if its graph is contained in the band bounded by the horizontal lines $y = -M$ and $y = M$.

A fundamental property of continuous functions is the following: 


THEOREM 32. (Weierstrass boundedness theorem) Let $f : [a, b] \to \mathbb{R}$ be a continuous function defined on the closed and bounded interval $[a, b]$. Then $f$ is bounded, $M = \sup f([a, b]) = f(c)$ and $m = \inf f([a, b]) = f(d)$, where $c, d \in [a, b]$. This means that the least upper bound ($\sup f([a, b])$) and the greatest lower bound ($\inf f([a, b])$) of the bounded set $f([a, b])$ are realized at $c$ and at $d$ respectively.


PROOF. a) Let us prove that $M = \sup f([a, b]) < \infty$. Suppose on the contrary, namely that $M = \infty$. Then, there is at least one sequence $\{x_n\}$ of elements from $[a, b]$ such that $f(x_n) \to \infty$. Since $\{x_n\}$ is bounded, we can apply the Cesaro-Bolzano-Weierstrass Theorem (see Theorem 12) and find a subsequence $\{x_{n_k}\}$ of $\{x_n\}$ which is convergent to an $x_* \in [a, b]$ (here we use the fact that $[a, b]$ is closed, how?). Since $f$ is continuous, one has that $f(x_{n_k}) \to f(x_*)$ when $k \to \infty$. But $f(x_{n_k}) \to \infty$ and the uniqueness of the limit implies that $f(x_*) = \infty$, a contradiction (why?). Hence $f$ is upper bounded. In the same way we can prove that $f$ is lower bounded (do it!).

b) Let us prove now that $M = f(c)$ for a $c$ in $[a, b]$. Since $M$ is the least upper bound, for any natural number $n \ge 1$ we can find an element $y_n \in [a, b]$ such that

$$(1.3) \qquad M - \frac{1}{n} < f(y_n) \le M \quad \text{(why?)}$$

The sequence $\{y_n\}$ is bounded and nonconstant (why?). Applying again the Cesaro-Bolzano-Weierstrass Theorem, one can find a subsequence $\{y_{n_k}\}$ of $\{y_n\}$ which is convergent to an element $c \in [a, b]$ (because the interval is closed). Since $f$ is continuous, $f(y_{n_k}) \to f(c)$, when $k \to \infty$. Making $k \to \infty$ in the inequality $M - \frac{1}{n_k} < f(y_{n_k}) \le M$ and using the definition of a subsequence ($n_1 < n_2 < \dots$), we get that $M = f(c)$. To prove that $m = f(d)$, $d \in [a, b]$, we work in the same manner (do it!).


THEOREM 33. (Darboux) Let $f : [a, b] \to \mathbb{R}$ be a continuous function defined on the closed and bounded interval $[a, b]$. Let $M = \sup f([a, b])$ and let $m = \inf f([a, b])$. Then the image of the interval $[a, b]$ through $f$ is exactly the closed interval $[m, M]$. More generally, a continuous function carries intervals into intervals.


PROOF. Let $\lambda$ be an element in $[m, M]$. We want to find an element $z$ in $[a, b]$ such that $f(z) = \lambda$. If $\lambda$ is equal to $m$ or to $M$, we can take $z = d$ or $c$ (from Theorem 32) respectively. So, we can assume that $\lambda \in (m, M)$ and that $f$ is not a constant function (in this last case the statement of the theorem is obvious). We define two subsets of the interval $[a, b]$:

$$A_1 = \{x \in [a, b] : f(x) \ge \lambda\}$$

and

$$A_2 = \{x \in [a, b] : f(x) \le \lambda\}.$$

If $A_1 \cap A_2$ is not empty, take $z$ in this intersection and the proof is finished. Suppose on the contrary, namely that $A_1 \cap A_2 = \emptyset$. Since $\lambda$ cannot be either $m$ or $M$, $A_1$ and $A_2$ are not empty (why?). Now, $[a, b] = A_1 \cup A_2$ (why?) and, since $f$ is continuous, $A_1$ and $A_2$ are closed in $\mathbb{R}$ (see Remark 9). In order to obtain a contradiction, we shall prove that it is not possible to decompose (to write as a union, or to cover) an interval $[a, b]$ into two disjoint closed and nonempty subsets. Indeed, let $c_2 = \sup A_2$. Since $f$ is continuous, $f(c_2) \le \lambda$ (why? remember the definition of the least upper bound and of the continuity!), i.e. $c_2 \in A_2$. If $c_2 \ne b$, then the subset $S_1 = \{x \in A_1 : x > c_2\}$ is not empty (why?). Take now $c_1 = \inf S_1$. Since $A_1$ is closed, $c_1 \in A_1$ (why?). If $c_1 > c_2$, take $h \in (c_2, c_1)$. This $h \in [a, b]$ and it cannot be either in $A_1$ or in $A_2$ (why?). Since $c_1 \ge c_2$, the unique possibility for $c_1$ is to be equal to $c_2$. But then, $c = c_1 = c_2 \in A_1 \cap A_2 = \emptyset$, a contradiction! Hence, $c_2 = \sup A_2 = b$. Take now $d_2 = \inf A_2$. Since $A_2$ is closed, one has that $d_2 \in A_2$. If $d_2 \ne a$, then the subset $S_2 = \{x \in A_1 : x < d_2\}$ is not empty (why?). Take now $d_1 = \sup S_2$. Since $A_1$ is closed, $d_1 \in A_1$ (why?). If $d_1 < d_2$, take again $g \in (d_1, d_2)$ and this last one cannot be either in $A_1$ or in $A_2$. Hence $d_1 = d_2 = d$ and this one must be in $A_1 \cap A_2$, a contradiction! So, $d_2 = a$, i.e. $\inf A_2 = a$ and $\sup A_2 = b$; in particular $b \in A_2$, because $A_2$ is closed. Repeating the first part of the argument with the roles of $A_1$ and $A_2$ exchanged, we also get that $\sup A_1 = b$, so $b \in A_1$, because $A_1$ is closed. Thus $b \in A_1 \cap A_2$, and we get again a new and the last contradiction! Hence $A_1 \cap A_2$ cannot be empty and the proof of the theorem is over.


We agree with the reader that the proof of this last theorem is too long! But... it is so clear and so elementary! Trying to understand and to reproduce logically the above proof is a good exercise for strengthening your power of concentration, and not only that!


THEOREM 34. Let $I$ be an open interval on the real line and let $f : I \to \mathbb{R}$ be a continuous function defined on $I$ with real values.
1) Assume that there are two points $b$ and $d$ in $I$ ($b < d$) such that the values $f(b)$ and $f(d)$ are nonzero and have distinct signs. Then, there is a point $c$ in the interval $(b, d)$ at which the value of $f$ is zero, i.e. $f(c) = 0$. 2) Now suppose that at $a \in I$ the value $f(a) > 0$ (or $f(a) < 0$). Then there is an $\varepsilon$-neighborhood $(a - \varepsilon, a + \varepsilon) \subset I$, such that $f(x) > 0$ (or $f(x) < 0$) for any $x \in (a - \varepsilon, a + \varepsilon)$.


PROOF. 1) We can simply apply Theorem 33. Indeed, since $f([b, d])$ is an interval (Theorem 33), the segment generated by $f(b)$ and $f(d)$ is completely contained in $f([b, d])$. Since $f(b)$ and $f(d)$ have distinct signs, 0 is between them, so $0 \in f([b, d])$, or $0 = f(c)$ for a $c \in [b, d]$. Since $f(b)$ and $f(d)$ are nonzero, $c \in (b, d)$.
2) Suppose that $f(a) > 0$. Let us assume the contrary, i.e. for every small $\varepsilon$ we can find in $(a - \varepsilon, a + \varepsilon)$ at least one number $x_\varepsilon$ (an $x$ which depends on $\varepsilon$) such that $f(x_\varepsilon) \le 0$. Take for such epsilons the values

$$1, \frac{1}{2}, \frac{1}{3}, \dots, \frac{1}{n}, \dots$$

and find $x_{1/n} \in \left( a - \frac{1}{n}, a + \frac{1}{n} \right)$ with $f(x_{1/n}) \le 0$, $n = 1, 2, \dots$. Since $f$ is continuous at $a$ and since the sequence $\{x_{1/n}\}$ tends to $a$ (why?), one has that $f(x_{1/n}) \to f(a)$. But the values $f(x_{1/n})$ are all nonpositive, so $f(a)$ is nonpositive, a contradiction! Hence, there is at least one $\varepsilon$ small enough such that for any $x$ in $(a - \varepsilon, a + \varepsilon)$, $f(x) > 0$. The case $f(a) < 0$ can be treated similarly (do it!).
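
Part 1) of Theorem 34 is what makes the bisection method work: a sign change of a continuous function on $[b, d]$ guarantees a zero inside, and halving the interval while keeping the sign change traps such a zero. The following is a minimal sketch; the function name `bisect`, the tolerance `tol` and the test function $x^3 - 2$ are choices made here for illustration.

```python
def bisect(f, b, d, tol=1e-10):
    """Approximate a zero of a continuous f on [b, d], assuming that
    f(b) and f(d) are nonzero and have distinct signs."""
    fb, fd = f(b), f(d)
    if fb * fd >= 0:
        raise ValueError("f(b) and f(d) must be nonzero with distinct signs")
    while d - b > tol:
        c = (b + d) / 2
        fc = f(c)
        if fc == 0:
            return c
        if fb * fc < 0:          # the sign change is on [b, c]
            d, fd = c, fc
        else:                    # the sign change is on [c, d]
            b, fb = c, fc
    return (b + d) / 2


root = bisect(lambda x: x**3 - 2, 1.0, 2.0)
print(root, root**3)
```

Each step halves the length of the interval, so after $k$ steps the zero is located within $\frac{d - b}{2^k}$.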


DEFINITION 10. Let $(X, d)$ be a metric space and let $I$ be an interval on the real line $\mathbb{R}$ (a subset $I$ of $\mathbb{R}$ is said to be an interval if for any pair of numbers $r_1, r_2 \in I$ and any real number $r$ with $r_1 \le r \le r_2$, one has that $r \in I$). Practically, we think of a curve in $X$ as being the image in $X$ of an interval $I$ through a continuous function $h : I \to X$. More exactly, we denote the couple $(I, h)$ by a small Greek letter $\gamma$ and say that $\gamma$ is a curve in $X$. If $A$ and $B$ are two "points" (elements) in $X$, we say that a curve $\gamma = (I, h)$ connects $A$ and $B$ if there are $a, b \in I$ such that $A = h(a)$ and $B = h(b)$. By a (closed) arc $[AB]$ in $X$ we mean the image in $X$ of a closed interval $[a, b]$ of $\mathbb{R}$ through a continuous function $h : [a, b] \to X$, i.e. $[AB] = \{x \in X : \text{there is } c \in [a, b] \text{ with } h(c) = x\}$.


EXAMPLE 1. a) Let {O; i, j, k} be a Cartesian coordinate system
in the vector space V3 of all free vectors in our 3-D space (identified
with R³). Any point M in R³ has 3 coordinates: M(x, y, z), where
OM = xi + yj + zk, x, y, z ∈ R. Let A(a_1, a_2, a_3) and B(b_1, b_2, b_3) be two
points in R³. The usual segment [A, B] is a closed arc which connects the
points A and B. Indeed, let h : [0, 1] → R³, h(t) = (a_1 + t(b_1 − a_1), a_2 +
t(b_2 − a_2), a_3 + t(b_3 − a_3)), be the usual continuous parameterization of
the segment [A, B]:

\[ \begin{cases} x = a_1 + t(b_1 - a_1) \\ y = a_2 + t(b_2 - a_2) \\ z = a_3 + t(b_3 - a_3) \end{cases}, \quad t \in [0, 1]. \]

Here γ = ([0, 1], h) is a curve in R³. This function h describes a
composition between the dilation of moduli b_1 − a_1, b_2 − a_2, b_3 − a_3 along
the Ox, Oy and Oz axes respectively, and the translation x → a + x
of center a = (a_1, a_2, a_3).

b) Let C = {(x, y) ∈ R² : (x − a)² + (y − b)² = r²} be the circle with
center at (a, b) and radius r. The parametrization of C

\[ \begin{cases} x = a + r\cos t \\ y = b + r\sin t \end{cases}, \quad t \in [0, 2\pi] \]

gives rise to a curve γ = ([0, 2π], h), where h(t) = (a + r cos t, b + r sin t).
In fact, h describes the continuous deformation process of the segment
[0, 2π] ⊂ R into the circle C in the metric space R².


DEFINITION 11. A subset A of a metric space (X, d) is said to be
connected if any pair of points M_1 and M_2 of A can be connected
by a continuous curve γ = (I, h), h : I → X, whose image h(I) lies in A.


COROLLARY 4. The connected subsets in R are exactly the intervals 
of R (for proof use the Darboux Theorem 38). 


For instance, A = [0, 1] ∪ [5, 8] is not connected because it is not an
interval (4 is between 0 and 8, but it is not in A!).


REMARK 10. A subset S of R³ is said to be convex if for any pair
of points A, B ∈ S, the whole segment [A, B] is included in S. For
instance, the parallelepipeds, the spheres, the ellipsoids, etc., are convex
subsets of R³. The union of two tangent spheres is connected but
it is not convex! (why?). It is clear that any convex subset of R³ is also
a connected subset of R³ (prove it!).


DEFINITION 12. Let f : A → R be a function defined on an open
subset A of R with values in R. A point a of A is a local maximum
point of f if there is an ε-neighborhood of a, (a − ε, a + ε) ⊂ A, such that
f(x) ≤ f(a) for any x ∈ (a − ε, a + ε). The value f(a) of f at a is called
a local extremum (maximum) for f. A point b of A is said to be a local
minimum point for f if there is an η-neighborhood of b, (b − η, b + η) ⊂ A,
such that f(x) ≥ f(b) for any x ∈ (b − η, b + η). The value f(b) of f
at b is called a local extremum (minimum) for f. A local maximum
point or a local minimum point is called a local extremum point. The
local extrema of f on A are all the local maxima and the local minima
of f in A. The (global) maximum of f on A is max f(A) (∈ R). The
(global) minimum of f on A is min f(A) (∈ R) (see Fig. 3.3).


Fig. 3.3 (labels in the figure: local min.; not local extremum)


A critical (or stationary) point c ∈ A for a differentiable function
f : A → R on A is a root of the equation f'(x) = 0, i.e. f'(c) = 0. For
instance, c = 2 is a stationary point for f(x) = (x − 2)³, f : R → R,
but it is not an extremum point for f (why?). The next result clarifies
the converse situation.


THEOREM 35. (1-D Fermat's Theorem) Let a be a local extremum
(local maximum or local minimum) point for a function f : A → R
(A is open). Assume that f is differentiable at a. Then f'(a) = 0, i.e.
a is a critical point of f. Practically, this statement says that for a
differentiable function f we must search for local extrema among the
critical points of f, i.e. among the solutions of the equation f'(x) = 0,
x ∈ A.


PROOF. Suppose that a is a local maximum point for f, i.e. there
is a small ε > 0 such that (a − ε, a + ε) ⊂ A and f(x) ≤ f(a) for any
x in (a − ε, a + ε) (if a is a local minimum point, one proceeds in the
same way, do it!). Look now at the formula:

\[ \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = f'(a). \tag{1.4} \]

If x ∈ (a − ε, a + ε) and x < a, since f(x) ≤ f(a), the quotient in (1.4)
is nonnegative, so one has that f'(a) ≥ 0 (why?). Now, if x ∈ (a − ε, a + ε),
but x > a, again since f(x) ≤ f(a), one gets that f'(a) ≤ 0. Both inequalities
give us that f'(a) = 0 and Fermat's theorem for a function of one variable is
proved.


However, Fermat's Theorem works only at the points at which
our function is differentiable. For instance, f(x) = |x| has at x = 0
a local (even a global) minimum (why?), but it is not differentiable
at this point (why?). The moral is that we must consider separately
the points at which a function is not differentiable and see (using the
definition only!) whether or not these points are local extremum points for
our function.


THEOREM 36. (Rolle Theorem) Let f : [a, b] → R (a < b) be
a continuous function. Assume that f is differentiable on the open
subinterval (a, b) and that f(a) = f(b). Then there is at least one point
c ∈ (a, b) such that f'(c) = 0.


PROOF. Let us apply the Weierstrass boundedness theorem (Theorem 32)
and find m = inf f([a, b]) and M = sup f([a, b]) as real numbers.
If m = M, then our function is a constant function and so
f'(x) = 0 for any x in (a, b). Hence we assume that m ≠ M. So the
number f(a) = f(b) cannot be simultaneously equal to m and M. Suppose
for instance that f(a) = f(b) ≠ M. Thus, a c with M = f(c),
c ∈ [a, b] (see the Weierstrass boundedness theorem), cannot be either
a or b, i.e. c ∈ (a, b). Therefore, this c is a local maximum point for f. Use
now Fermat's Theorem and find that f'(c) = 0.


For instance, if f(x) = x⁴ − 16, x ∈ [−1, 1], then f(−1) = f(1) =
−15 and f'(x) = 0 supplies us with a unique solution c = 0. The
continuity at the ends of the interval [a, b] is necessary, as we can see
in the following example. Let us take

\[ f(x) = \begin{cases} x, & \text{if } x \in [0, 1), \\ 0, & \text{if } x = 1. \end{cases} \]

This function is defined on [0, 1], it is differentiable on (0, 1) and f(0) =
f(1), but its derivative f'(x) = 1 has no zero on (0, 1).
2. Sequences and series of functions


We know how to measure the length ||a|| = √(a_1² + a_2² + a_3²) of a vector
a = a_1 i + a_2 j + a_3 k of V3, the 3-dimensional vector space of all free
vectors (here a_1, a_2, a_3 ∈ R are the coordinates of a). The function a ↦
||a||, which associates to a vector a its length ||a||, has the following
basic properties:


n1. ||a|| = 0 if and only if a = 0,

n2. ||a + b|| ≤ ||a|| + ||b|| for any a, b ∈ V3,

(2.1) n3. ||λa|| = |λ| ||a|| for any λ ∈ R and a ∈ V3.


If instead of V3 we take any real vector space V together with a
mapping like above, x ↦ ||x|| ∈ [0, ∞), x ∈ V, which fulfils the analogous
requirements n1, n2 and n3 from (2.1), we get the general notion
of a normed space (V, ||.||).


DEFINITION 13. Let V be an arbitrary real vector space and let
f ↦ ||f|| be a mapping which associates to any element f of V a
nonnegative real number ||f||. If this mapping satisfies the following
properties:

ns1. ||f|| = 0 if and only if f = 0, f ∈ V,
ns2. ||f + g|| ≤ ||f|| + ||g|| for any f, g ∈ V, and
ns3. ||λf|| = |λ| ||f|| for any λ ∈ R and f ∈ V,

we say that the pair (V, ||.||) is a normed space and the mapping
x ↦ ||x|| (the norm of x) is called a norm application (function) or
simply a norm on V.


For instance, the norm of a matrix A = (a_ij), i = 1, 2, ..., n; j =
1, 2, ..., m, is

The mapping A ↦ ||A|| satisfies the properties of a norm (prove it!)
on the vector space of all n × m matrices. In addition, one can prove
(not so easy!) that

(2.2) ns4. ||AB|| ≤ ||A|| ||B||

for any two matrices n × m and m × p respectively.


REMARK 11. It is easy to see that a normed space (V, ||.||) is also
a metric space with the induced distance d, where d(x, y) = ||x − y||
(prove this!). For instance, {x_n} → x if and only if ||x_n − x|| → 0 as
n → ∞.


If we consider now a bounded function f : A → R defined on an
arbitrary set A with real values, we can define the norm ("length") of
f by the formula: ||f|| = sup |f(A)|, where |f(A)| = {|f(a)| : a ∈ A} is
the absolute value of the image of A through f, or simply the modulus
of the image of f. This norm is also called the sup-norm.


THEOREM 37. Let B(A) = {f : A → R, f bounded} be the vector
space of all bounded functions defined on a fixed set A. Then the
mapping f ↦ ||f|| is a norm on B(A) with the additional property:

n4. ||fg|| ≤ ||f|| ||g||

for any f, g ∈ B(A). Moreover, any Cauchy sequence {f_n} with respect
to this norm is a convergent sequence in B(A).


PROOF. Let us prove for instance ns2. Since

|f(a) + g(a)| ≤ |f(a)| + |g(a)| ≤ sup{|f(a)| : a ∈ A} + sup{|g(a)| : a ∈ A},

taking sup on the left side (it exists, because it is upper bounded by
a constant quantity), we get the property ns2: ||f + g|| ≤ ||f|| + ||g||.
The property n4 can be proved in the same manner (do it!). The other
properties are obvious (prove them with all details!). Let us prove the
last statement. Since

|f_{n+p}(x) − f_n(x)| ≤ sup{|f_{n+p}(x) − f_n(x)| : x ∈ A} = ||f_{n+p} − f_n||,

for a fixed x in A, the numerical sequence {f_n(x)} is a Cauchy sequence
in R. Since R is complete, i.e. any Cauchy sequence in R has a (unique)
limit in R, let us associate to x the limit lim_{n→∞} f_n(x), denoted by f(x),
i.e. a real number which depends on x. We shall prove that this new
function f : A → R 1) is bounded, i.e. belongs to B(A), and 2) is
the limit of the sequence {f_n} in B(A), relative to the sup-norm. For
2) let us take a small ε > 0 and let us find a rank N which depends on
ε such that

(2.3) ||f_{n+p} − f_n|| < ε

for any n ≥ N and for any p = 1, 2, .... Since f_n(x) → f(x) for any
fixed x in A and since

|f_{n+p}(x) − f_n(x)| ≤ ||f_{n+p} − f_n|| < ε

for any n ≥ N and any p, let us make p large enough, i.e. p → ∞ in
the last inequality. We get |f(x) − f_n(x)| ≤ ε (why?) for n ≥ N and
for any x in A. Take now sup on the left and get:

(2.4) ||f − f_n|| ≤ ε

for any n ≥ N. Hence {f_n} converges to f in the sup-norm. We make
n = N in (2.4) and write

|f(x)| ≤ |f(x) − f_N(x)| + |f_N(x)| ≤ ||f − f_N|| + ||f_N|| ≤ ε + ||f_N||.

Take now sup on the left and we get:

||f|| ≤ ε + ||f_N||,

i.e. f is bounded and so {f_n} converges to f in B(A).


DEFINITION 14. Let {f_n} be a sequence of bounded functions on A
and let f be another bounded function on A. We say that the sequence
{f_n} is uniformly convergent to f (write $f_n \xrightarrow{uc} f$) if the sequence of
numbers {||f_n − f||} is convergent to 0. If for any fixed x ∈ A the
sequence of numbers {f_n(x)} is convergent to f(x), we say that the
sequence of functions {f_n} is simply (or pointwise) convergent to f
(f_n → f). Since |f_n(x) − f(x)| ≤ ||f_n − f||, the uniform convergence
implies the simple convergence (why? give details!).


The notion of uniform convergence is stronger than the notion of
simple convergence. For instance, let

f_n(x) = x^n, x ∈ [0, 1].

Here A = [0, 1] and, for x ∈ [0, 1), lim_{n→∞} f_n(x) = 0 (why?). For x = 1,
lim_{n→∞} f_n(1) = 1. So, the pointwise limit function is f(x) = 0 if 0 ≤ x <
1 and f(1) = 1. Hence, the sequence of functions {f_n} is pointwise
convergent to this f. Let us evaluate now

||f_n − f|| = sup{|f_n(x) − f(x)| : x ∈ [0, 1]} = 1.

Hence ||f_n − f|| = 1 does not tend to 0! So, the sequence of functions
is not uniformly convergent.
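The value of this norm can also be observed numerically. The following sketch is only an illustration: the function name sup_distance and the grid size are assumptions made here, and a finite grid can only give a lower bound for the true supremum. It samples |x^n − f(x)| on [0, 1], where f is the pointwise limit found above.

```python
def sup_distance(n, samples=100001):
    """Grid estimate (a lower bound) of sup |x**n - f(x)| over [0, 1]."""
    worst = 0.0
    for i in range(samples):
        x = i / (samples - 1)
        limit = 1.0 if x == 1.0 else 0.0  # pointwise limit of x**n
        worst = max(worst, abs(x**n - limit))
    return worst


for n in (1, 5, 50):
    print(n, sup_distance(n))
```

For a fixed grid the estimate eventually decreases when n becomes very large, because the supremum is approached only for x closer and closer to 1; refining the grid restores values near 1.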
REMARK 12. (Weierstrass) We need not always compute exactly
the norm ||f_n − f||. In fact, for the uniform convergence to f of the
sequence {f_n}, it is sufficient to find a sequence of numbers {α_n} such that
|f_n(x) − f(x)| ≤ α_n for any x ∈ A and for any n ≥ N (a fixed natural
number) and such that {α_n} → 0 (why?). For instance, take f_n(x) = sin(nx)/n.
Since for any fixed x ∈ R, |sin(nx)/n| ≤ 1/n, we have that f_n(x) → 0 when
n → ∞. But the right side of this last inequality is independent of x.
So we can take α_n = 1/n and apply the above remark of Weierstrass.
Hence f_n(x) = sin(nx)/n is uniformly convergent to 0 on R. If instead of
sin nx one takes any other bounded function g(x) on an arbitrary
interval I ⊂ R, we get that f_n(x) = g(x)/n is uniformly convergent to 0 on
I (prove it!).
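A quick numerical check of the bound |sin(nx)/n| ≤ 1/n may help to see why the estimate does not depend on x. The sketch below is an illustration only; the name sup_on_grid, the sampling interval [−10, 10] and the grid size are assumptions introduced here.

```python
import math


def sup_on_grid(n, lo=-10.0, hi=10.0, samples=20001):
    """Largest sampled value of |sin(n*x)/n| on [lo, hi]."""
    step = (hi - lo) / (samples - 1)
    return max(abs(math.sin(n * (lo + i * step)) / n) for i in range(samples))


for n in (1, 7, 100):
    # The sampled maximum never exceeds the x-independent bound 1/n.
    print(n, sup_on_grid(n), 1.0 / n)
```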


In order to test the uniform convergence of a sequence of continuous 
functions we can use the following result. 


THEOREM 38. Let (X, d) be a metric space and let {f_n} be a uniformly
convergent sequence of bounded continuous functions defined on
X with real or complex values. Let f be the limit function of {f_n}.
Then the function f itself is a bounded and continuous function on X.


PROOF. Recall that ||f_n|| = sup |f_n(X)| < ∞ for any n = 1, 2, ...
(f_n is bounded). Let ε > 0 be a small positive real number and let N
be a rank (a fixed natural number) such that

(2.5) ||f − f_n|| < ε for any n ≥ N.

1) Let us prove that f is bounded on X. Take n = N in (2.5),
remember the basic property of the norm function (see Theorem 37)
and write

||f|| = ||(f − f_N) + f_N|| ≤ ||f − f_N|| + ||f_N|| < ε + ||f_N||.

Since f_N is bounded (||f_N|| < ∞), we get that f is also bounded.

2) In order to prove the continuity of f at a fixed point a of X, let
us take a sequence {a_k} which is convergent to a, when k → ∞. Since
{f_n} is uniformly convergent to f, there is a large number L such that
||f − f_L|| < ε/3. Since this f_L is continuous, there is a rank K such that
for any k ≥ K one has

|f_L(a_k) − f_L(a)| < ε/3.

Now,

(2.6) |f(a_k) − f(a)| = |f(a_k) − f_L(a_k) + f_L(a_k) − f(a)|
      ≤ |f(a_k) − f_L(a_k)| + |f_L(a_k) − f(a)|
      ≤ sup{|f(x) − f_L(x)| : x ∈ X} + |f_L(a_k) − f(a)|
      = ||f − f_L|| + |f_L(a_k) − f(a)|.

But,

(2.7) |f_L(a_k) − f(a)| = |f_L(a_k) − f_L(a) + f_L(a) − f(a)|
      ≤ |f_L(a_k) − f_L(a)| + |f_L(a) − f(a)|
      < ε/3 + sup{|f_L(x) − f(x)| : x ∈ X} = ε/3 + ||f_L − f||

for any k ≥ K (here we just used the continuity of f_L). Combining the
inequalities (2.6) and (2.7), we find

|f(a_k) − f(a)| ≤ ||f − f_L|| + ε/3 + ||f_L − f|| < ε/3 + ε/3 + ε/3 = ε

for any k ≥ K. Hence f(a_k) → f(a), so f is continuous at a.


This last result is useful whenever we want to prove that a sequence
of continuous functions {f_n} is NOT uniformly convergent. Namely,
we construct the limit function f(x) = lim_{n→∞} f_n(x) for any fixed x. If the
function f(x) is not continuous, then, because of Theorem 38, we must
conclude that {f_n} cannot be uniformly convergent to f.

For instance, the sequence f_n(x) = x^n, x ∈ [0, 1], is convergent to
f(x) = 0 if x ∈ [0, 1) and f(1) = 1. Since this last function is not
continuous, our sequence cannot be uniformly convergent to f. It is
only simply convergent to f.

Sometimes it is useful to integrate term by term a sequence of func- 
tions and see what happens with the limit function. 


THEOREM 39. Let {f_n} be a sequence of continuous functions
which is uniformly convergent to a continuous (see Theorem 38)
function f on the interval [a, b]. For any fixed x ∈ [a, b] let F_n(x) =
∫_a^x f_n(t) dt, n = 0, 1, ..., and F(x) = ∫_a^x f(t) dt be the canonical
primitives of f_n and of f respectively on [a, b]. Then the sequence {F_n} is
uniformly convergent to F on [a, b]. In particular, for x = b, we get a
very useful relation:

\[ \lim_{n \to \infty} \int_a^b f_n(t)\,dt = \int_a^b \lim_{n \to \infty} f_n(t)\,dt. \tag{2.8} \]


PROOF. Let us evaluate

\[ \begin{aligned} \|F_n - F\| &= \sup\{|F_n(x) - F(x)| : x \in [a, b]\} \\ &\le \sup\Big\{\int_a^x |f_n(t) - f(t)|\,dt : x \in [a, b]\Big\} \\ &\le \|f_n - f\| \, \sup\Big\{\int_a^x dt : x \in [a, b]\Big\} = (b - a)\,\|f_n - f\|. \end{aligned} \tag{2.9} \]

Now, since {f_n} is uniformly convergent to f, the numerical
sequence ||f_n − f|| tends to zero. Hence, since (2.9) says that

||F_n − F|| ≤ (b − a) ||f_n − f||,

we have that ||F_n − F|| → 0, i.e. {F_n} is uniformly convergent to F on
[a, b].


In the following we show how to use this result in practice. 

Let us take the sequence of functions f_n(x) = n x e^{−nx²}, x ∈ [0, 1].
It is clear that this sequence is simply convergent to the continuous
function f(x) = 0 for any x in [0, 1]. Since f is continuous we cannot
decide if our sequence is uniformly convergent or not only by using
Theorem 38. If the sequence were uniformly convergent, then, using
the relation (2.8), we would get:

\[ \lim_{n \to \infty} \int_0^1 n x e^{-nx^2}\,dx = \int_0^1 \lim_{n \to \infty} n x e^{-nx^2}\,dx = 0. \tag{2.10} \]

But

\[ \int_0^1 n x e^{-nx^2}\,dx = -\frac{1}{2} e^{-nx^2}\Big|_0^1 = -\frac{1}{2}\left(e^{-n} - 1\right) \to \frac{1}{2} \ (\ne 0). \]

Hence, our assumption cannot be true. So, our sequence is not
uniformly convergent on [0, 1].
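The computation above can be confirmed numerically. The sketch below is an illustration under stated assumptions: the helper names f, simpson and integral are introduced here, and the composite Simpson rule is just one convenient quadrature, not a method used in the text. It compares the numerical integral of n x e^{−nx²} over [0, 1] with the closed form (1 − e^{−n})/2.

```python
import math


def f(n, x):
    """The n-th function of the sequence, f_n(x) = n x exp(-n x^2)."""
    return n * x * math.exp(-n * x * x)


def simpson(g, a, b, m=20000):
    """Composite Simpson rule with an even number m of subintervals."""
    h = (b - a) / m
    total = g(a) + g(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0


def integral(n):
    """Numerical value of the integral of f_n over [0, 1]."""
    return simpson(lambda x: f(n, x), 0.0, 1.0)


for n in (1, 10, 100):
    print(n, integral(n), (1.0 - math.exp(-n)) / 2.0)
```

The integrals approach 1/2 while the pointwise limit integrates to 0, which is exactly the failure of (2.8) for this sequence.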


REMARK 13. In Theorem 39 we saw that a uniformly convergent
sequence of continuous functions can be integrated "termwise". But
what about their "termwise" derivatives? Can we differentiate "termwise"
a uniformly convergent sequence of differentiable functions? In
general, we cannot, as the following example shows. Let f_n(x) = x^n/n,
x ∈ [0, 1]. Since ||f_n − 0|| = sup{x^n/n : x ∈ [0, 1]} = 1/n → 0 when
n → ∞, we find that {f_n} is uniformly convergent to f(x) = 0 on
[0, 1]. But f_n'(x) = x^{n−1} is not uniformly convergent on [0, 1], as we saw
above.


THEOREM 40. If we want to differentiate "termwise" the sequence
{f_n} of differentiable functions on [a, b], the following conditions are
sufficient: 1) {f_n} is uniformly convergent to f on [a, b], 2) {f_n'} is
uniformly convergent to g on [a, b] and 3) f_n ∈ C¹[a, b] for any n =
0, 1, .... Then f is also differentiable and f' = g (⇒ f is also of class
C¹ on [a, b]).
PROOF. Indeed, using Theorem 39 for the sequence {f_n'}, which is
uniformly convergent to g, one has that

\[ f_n(x) - f_n(a) = \int_a^x f_n'(t)\,dt \to \int_a^x g(t)\,dt \tag{2.11} \]

uniformly on [a, b]. Since {f_n} is uniformly convergent to f, one has that
f(x) − f(a) = ∫_a^x g(t) dt (why?). Let x_0 be a
point in [a, b]. Since ∫_{x_0}^x g(t) dt = (x − x_0) g(c_x) (mean formula), where
c_x is a point in the segment [x_0, x],

\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0} g(c_x) = g(x_0). \]

So f'(x_0) exists and it is equal to g(x_0). Hence, f' = g on [a, b].


DEFINITION 15. Let {f_n} be a sequence of functions defined on a
subset A of R. For every n = 0, 1, ... we denote by

s_n(x) = f_0(x) + f_1(x) + ... + f_n(x).

A series of functions f_n is an "infinite" sum

\[ \sum_{k=0}^{\infty} f_k. \]

If the sequence of "partial sums" {s_n} is simply convergent to the
function s on A, we say that the series Σ_{k=0}^∞ f_k is simply (pointwise)
convergent to s (its sum) on A. If the sequence {s_n} is uniformly convergent
to s on A, we say that the series Σ_{k=0}^∞ f_k is uniformly convergent to s (its
sum) on A. In this last case, we simply write s = Σ_{k=0}^∞ f_k.
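Let us consider the series of functions Σ_{k=0}^∞ x^k. One has

\[ \sum_{k=0}^{\infty} x^k = \lim_{n \to \infty} (1 + x + x^2 + \dots + x^n) = \lim_{n \to \infty} \frac{1 - x^{n+1}}{1 - x} = \frac{1}{1 - x} \]

for any x ∈ (−1, 1). So, the (geometric) series Σ_{k=0}^∞ x^k is simply
(pointwise) convergent to 1/(1 − x) on (−1, 1). Let us see if it is uniformly
convergent on (−1, 1). For this, let us evaluate

\[ \|s_n - s\| = \sup_{x \in (-1, 1)} \Big| \frac{1 - x^{n+1}}{1 - x} - \frac{1}{1 - x} \Big| = \sup_{x \in (-1, 1)} \frac{|x|^{n+1}}{1 - x} = \infty. \]
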
Hence, our series is not uniformly convergent on the whole interval
(−1, 1) but it is uniformly convergent on every closed subinterval [a, b]
of (−1, 1). Indeed, in this case, if we denote by c = max{|a|, |b|}, we
get

\[ \|s_n - s\| \le \frac{c^{\,n+1}}{1 - c} \to 0 \quad \text{when } n \to \infty, \]

because c ∈ (0, 1). Thus the series is uniformly convergent on [a, b].
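The estimate by c^{n+1}/(1 − c) can be tested on a concrete subinterval. The following sketch is an illustration only; the names partial_sum, sup_error and geometric_bound, the interval [−0.9, 0.9] and the grid size are assumptions made here.

```python
def partial_sum(x, n):
    """s_n(x) = 1 + x + ... + x**n, computed by Horner's scheme."""
    total = 0.0
    for _ in range(n + 1):
        total = total * x + 1.0
    return total


def sup_error(n, a, b, samples=2001):
    """Largest sampled value of |s_n(x) - 1/(1 - x)| on [a, b]."""
    worst = 0.0
    for i in range(samples):
        x = a + (b - a) * i / (samples - 1)
        worst = max(worst, abs(partial_sum(x, n) - 1.0 / (1.0 - x)))
    return worst


def geometric_bound(n, a, b):
    """The bound c**(n+1) / (1 - c) with c = max(|a|, |b|) < 1."""
    c = max(abs(a), abs(b))
    return c ** (n + 1) / (1.0 - c)


for n in (5, 20, 80):
    print(n, sup_error(n, -0.9, 0.9), geometric_bound(n, -0.9, 0.9))
```

On [−0.9, 0.9] the sampled error stays below the bound and both tend to 0, in agreement with the uniform convergence on closed subintervals.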
Sometimes, it is very difficult to evaluate "the error function" s_n −
s. This is why we need some other tools for deciding if a series is
uniformly convergent or not. A series of functions Σ_{k=0}^∞ f_k is said to
be absolutely uniformly convergent if the series of the moduli of these
functions, Σ_{k=0}^∞ |f_k|, is uniformly convergent. Recall that |f|(x) = |f(x)|.

It is not difficult to see that an absolutely uniformly convergent series of
functions Σ_{k=0}^∞ f_k is also uniformly convergent. Indeed, let S_n = Σ_{k=0}^n |f_k|
and let S = Σ_{k=0}^∞ |f_k| be the sum of the series of moduli. Then

|s(x) − s_n(x)| = |f_{n+1}(x) + f_{n+2}(x) + ...| ≤ |f_{n+1}(x)| + |f_{n+2}(x)| + ... (why?)
= S(x) − S_n(x) ≤ sup{|S(x) − S_n(x)| : x ∈ A} = ||S − S_n||.

Hence |s(x) − s_n(x)| ≤ ||S − S_n|| for any x ∈ A. Taking now sup on
x ∈ A we get that ||s_n − s|| ≤ ||S − S_n||. Since our series is absolutely
uniformly convergent, ||S − S_n|| → 0 when n → ∞. Using now
the last inequality, we get that ||s_n − s|| → 0, i.e. the initial series
is uniformly convergent. A powerful and useful test for the absolute
uniform convergence is the following.


THEOREM 41. (Weierstrass Test for series of functions) Let A be a
subset of real numbers and let Σ_{k=0}^∞ f_k be a series of functions defined on
A. Assume that ||f_n|| can be upper bounded by α_n ∈ [0, ∞) (|f_n(x)| ≤
α_n where x runs on A) for any n = 0, 1, ... and that the numerical
series Σ_{k=0}^∞ α_k is convergent. Then the series Σ_{k=0}^∞ f_k is absolutely uniformly
convergent. In particular, it is also uniformly convergent.

PROOF. Let us fix a small positive real number ε > 0 and an x ∈ A.
Let

S_n = |f_0| + |f_1| + ... + |f_n|

be the n-th partial sum of the series Σ_{k=0}^∞ |f_k|. Since the numerical series
Σ_{k=0}^∞ α_k is convergent, there is a rank N such that

α_{n+1} + α_{n+2} + ... + α_{n+p} < ε

for any n ≥ N and for any natural number p.
Let us evaluate |S_{n+p}(x) − S_n(x)|:

(2.12) |S_{n+p}(x) − S_n(x)| = |f_{n+1}(x)| + |f_{n+2}(x)| + ... + |f_{n+p}(x)|
       ≤ α_{n+1} + α_{n+2} + ... + α_{n+p} < ε.

From (2.12) we obtain that the sequence {S_n(x)} is a Cauchy
sequence of real numbers (see Definition 2). Since on the real line any
Cauchy sequence is convergent (see Theorem 13) we get that the
sequence {S_n(x)} is convergent to a real number S(x) (this means that
this real number depends on x, i.e. it changes if we change x, so it
is a function of x). Come back now to (2.12) and make p → ∞. We
find that |S(x) − S_n(x)| ≤ ε for any n ≥ N and for any x ∈ A. If here,
in the last inequality, we take sup on x, we finally get: ||S − S_n|| ≤ ε
for any n ≥ N. Hence, the series Σ_{k=0}^∞ |f_k| is uniformly convergent to
S (its sum). Thus, our initial series Σ_{k=0}^∞ f_k is uniformly and absolutely
convergent.


The series of functions Σ_{n=1}^∞ arctan(nx)/n² is absolutely uniformly
convergent because |arctan(nx)/n²| ≤ (π/2)·(1/n²) and the numerical series
Σ_{n=1}^∞ (π/2)·(1/n²) = (π/2) Σ_{n=1}^∞ 1/n² is convergent (why?) (see the
Weierstrass Test, Theorem 41).
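The Weierstrass Test for the series with terms arctan(nx)/n² can be illustrated numerically. The sketch below is a check under stated assumptions, not a proof: the names term, majorant and partial_sum are introduced here, and the tail estimate uses the elementary inequality 1/(N+1)² + ... + 1/M² ≤ 1/N − 1/M.

```python
import math


def term(n, x):
    """n-th term arctan(n x) / n^2 of the series."""
    return math.atan(n * x) / (n * n)


def majorant(n):
    """The x-independent bound (pi/2) / n^2 for |term(n, x)|."""
    return (math.pi / 2.0) / (n * n)


def partial_sum(x, terms):
    """Sum of the first `terms` terms of the series at the point x."""
    return sum(term(n, x) for n in range(1, terms + 1))


for x in (-50.0, 0.3, 1000.0):
    gap = abs(partial_sum(x, 400) - partial_sum(x, 100))
    # The gap is controlled by the tail of the majorant series, whatever x is.
    print(x, gap, (math.pi / 2.0) * (1.0 / 100 - 1.0 / 400))
```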

Another very useful test is the Abel-Dirichlet Test for series of func- 
tions, a generalization of the test with the same name for numerical 
series. 


THEOREM 42. (Abel-Dirichlet Test for series of functions)

Let {a_n(x)}, {b_n(x)} be two sequences of functions defined on the
same interval I of R. We assume that ||a_n|| is a decreasing to zero
sequence and that the partial sums s_n(x) = Σ_{k=0}^n b_k(x) of the series
of functions Σ_{n=0}^∞ b_n(x) are uniformly bounded, i.e. there is a positive
real number M > 0 such that ||s_n|| < M for any n = 1, 2, ....

Then the series of functions Σ_{n=0}^∞ a_n(x) b_n(x) is (absolutely)
uniformly convergent on the interval I.
PROOF. Let us come back to the Abel-Dirichlet Test for numerical
series and substitute the numbers a_n, b_n, s_n, S_n with the corresponding
functions a_n(x), b_n(x), s_n(x) and S_n(x) = Σ_{k=0}^n a_k(x) b_k(x) respectively.
We obtain (do it step by step!) that the sequence of functions {S_n(x)}
is uniformly Cauchy, i.e. for any ε > 0, there is a rank N_ε such that if
n ≥ N_ε one has that

(2.13) ||S_{n+p} − S_n|| < ε

for any p = 1, 2, .... In particular,

|S_{n+p}(x) − S_n(x)| < ε

for any fixed x in I. So, the numerical sequence {S_n(x)} is convergent
to a number S(x) which depends on x. Making p → ∞ in (2.13) we get

|S(x) − S_n(x)| ≤ ε

for any n ≥ N_ε and for any x in I. Take now sup on x and find that

||S − S_n|| ≤ ε

for any n ≥ N_ε. This means that {S_n} is uniformly convergent to S,
i.e. our series of functions Σ_{n=0}^∞ a_n(x) b_n(x) is uniformly convergent on
the interval I. With some small changes in the proof, we find that this
last series is absolutely uniformly convergent on I (do them!).


Let us take the series of functions Σ_{n=1}^∞ (−1)^{n−1} x^n/n for x ∈ [−1 + ε, 1],
where 0 < ε < 2. Let us apply the Abel-Dirichlet Test for series of
functions by taking a_n(x) = x^n/n and b_n(x) = (−1)^{n−1}. We easily see
that ||a_n|| = 1/n and that the series Σ_{n=1}^∞ (−1)^{n−1} has bounded partial
sums. Hence our series Σ_{n=1}^∞ (−1)^{n−1} x^n/n, x ∈ [−1 + ε, 1], is absolutely
and uniformly convergent.
The following question arises: can we integrate or differentiate term
by term (termwise) a series of functions Σ_{k=0}^∞ f_k? Since everything reduces
to the sequence of partial sums s_n = f_0 + f_1 + ... + f_n, we can apply
the results from Theorem 39 and Theorem 40 and find:

THEOREM 43. Let Σ_{n=0}^∞ f_n be a uniformly convergent series of
continuous functions on the interval [a, b], let s be its sum and let F_n(x) be the
canonical primitives of f_n(t) on [a, b]: F_n(x) = ∫_a^x f_n(t) dt, n = 0, 1, ....
Then the series of functions Σ_{n=0}^∞ F_n is uniformly convergent on [a, b]
and S(x) = ∫_a^x s(t) dt is its sum. So,

\[ \int_a^x \Big(\sum_{n=0}^{\infty} f_n(t)\Big)\,dt = \sum_{n=0}^{\infty} \int_a^x f_n(t)\,dt \tag{2.14} \]

(this means that the integration symbol ∫ commutes with the symbol Σ
of a series). In particular, for x = b, we get a very useful formula:

\[ \int_a^b \Big(\sum_{n=0}^{\infty} f_n(t)\Big)\,dt = \sum_{n=0}^{\infty} \int_a^b f_n(t)\,dt. \tag{2.15} \]

If in addition the f_n are functions of class C¹ on [a, b] (the f_n are
differentiable and their derivatives are continuous on [a, b]; shortly write
f_n ∈ C¹[a, b]) and if the series of derivatives, u = Σ_{n=0}^∞ f_n', is uniformly
convergent on [a, b], then s is differentiable on [a, b] and s' = u. So
we can differentiate "term by term" (or termwise) the initial series of
functions.


In the first statement s is a continuous function on [a, b] because of
the basic Theorem 38. In this last theorem there is a requirement: the f_n
must be bounded. This is true because the f_n are continuous and defined
on a bounded and closed interval (see Theorem 32).

Let us study the following series of functions Σ_{n=0}^∞ (−1)^n x^n on (−1, 1).
For any fixed x, one has the formula

\[ \frac{1}{1 + x} = 1 - x + x^2 - \dots + (-1)^n x^n + \dots, \quad x \in (-1, 1), \tag{2.16} \]

the famous geometric series with ratio −x. Hence, our series is simply
convergent on (−1, 1). It is not uniformly convergent on (−1, 1) but it is
absolutely and uniformly convergent on any closed subinterval [a, b] of
(−1, 1) (apply the same reasoning as in the case of the infinite geometric
series). Let us derive an interesting and useful formula from (2.16). Let
us fix an x_0 in (−1, 1) and take a, b such that x_0 ∈ [a, b], a or b is 0
(if x_0 < 0, take b = 0; if x_0 > 0, take a = 0) and [a, b] is included in
(−1, 1). Since all conditions in Theorem 43 are fulfilled, we integrate
term by term formula (2.16) and get

\[ \int_0^{x_0} \big(1 - t + t^2 - \dots + (-1)^n t^n + \dots\big)\,dt = \Big(t - \frac{t^2}{2} + \frac{t^3}{3} - \dots + (-1)^n \frac{t^{n+1}}{n+1} + \dots\Big)\Big|_0^{x_0} = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x_0^n}{n} = \ln(1 + x_0). \]

Now, let us put instead of x_0 an arbitrary x in (−1, 1) and obtain

\[ \ln(1 + x) = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^n}{n}, \quad \text{for any } x \in (-1, 1). \tag{2.17} \]

The value of the alternating series Σ_{n=1}^∞ (−1)^{n−1}/n is ln 2 but, to prove
this, one needs the continuity of the function on the right in formula
(2.17). And this is not so easy to prove (see the Abel Theorem,
Theorem 46).
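Formula (2.17) is easy to test numerically. The sketch below is an illustration only: the function name log1p_series and the numbers of terms are assumptions made here, and the reference value comes from Python's math.log1p. It also shows how slowly the alternating series converges at x = 1.

```python
import math


def log1p_series(x, terms):
    """Partial sum x - x^2/2 + x^3/3 - ... with `terms` terms."""
    total, power = 0.0, 1.0
    for n in range(1, terms + 1):
        power *= x
        total += (-1) ** (n - 1) * power / n
    return total


print(log1p_series(0.5, 60), math.log1p(0.5))
print(log1p_series(-0.5, 60), math.log1p(-0.5))
# At x = 1 the partial sums approach ln 2, but only slowly.
print(log1p_series(1.0, 1000), math.log(2.0))
```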

Let us compute the sum of the series of functions Σ_{n=1}^∞ n x^n on
its maximal domain of definition. First of all, let us fix an x on the
real line and try to find conditions for the convergence of the series
Σ_{n=1}^∞ n x^n. Let us see where the series (numerical series this time!) is
absolutely convergent. Applying the Ratio Test (Theorem 27) to the
series of moduli Σ_{n=1}^∞ n|x|^n, we get lim_{n→∞} (n + 1)|x|^{n+1} / (n|x|^n) = |x|.
We know that if |x| < 1, the series is absolutely convergent; in particular
it is convergent on (−1, 1). If |x| > 1, the series is divergent because, in
this case, the sequence {n x^n} is not bounded (why?), so it cannot be
convergent to 0. For x = 1 or x = −1, the series is divergent. Hence, the
definition domain of the function s(x) = Σ_{n=1}^∞ n x^n is exactly (−1, 1). Let us
compute s(x):

\[ s(x) = \sum_{n=1}^{\infty} n x^n = x\,(x + x^2 + \dots + x^n + \dots)' = x \cdot \Big(\frac{x}{1 - x}\Big)' = \frac{x}{(1 - x)^2}. \]

Here we used Theorem 43 to differentiate term by term the series
x + x² + ... + x^n + ... = x/(1 − x) (why are the hypotheses of this theorem
fulfilled?).
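The closed form x/(1 − x)² can be compared with partial sums of the series. This is only a numerical illustration; the function name series_nxn, the sample points and the number of terms are assumptions chosen here.

```python
def series_nxn(x, terms):
    """Partial sum of n * x**n for n = 1, ..., terms."""
    total, power = 0.0, 1.0
    for n in range(1, terms + 1):
        power *= x
        total += n * power
    return total


for x in (0.5, -0.3, 0.9):
    # Partial sum with 2000 terms against the closed form x / (1 - x)^2.
    print(x, series_nxn(x, 2000), x / (1.0 - x) ** 2)
```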


3. Problems 


1. Find the convergence set and the limit for the following sequences
of functions: a) f(z) = 2"; b) fr(v) = 53 ¢) fn(@) = gig, @ € (0,00); 
d) faltt) = Be, # € [0,1]; €) fale) = p28, w € [1,00 00); f) fa(x) = 


x € [1, oo). 


2 
ae ’ 
2. Say if the convergence of the above sequences (see Problem 1)
is uniform or not. Study the absolute uniform convergence of the same
sequences.

3. Let fn(@) = qg,z2) 7 € [0,1]. Prove that {f,} is not uniformly 


convergent but ie fr(a)dx fo im O fra(x )dx 


4. Prove that f,(z) = 52, 7 re [-1 1 is uniformly convergent 
to f(x) (find it!) but f’ is not uniformly convergent to f’. Do the same 
for f,(z) = =, x € [0,1]. 

5. Prove that the series of functions Σ_{n=1}^∞ (x^n − x^{n−1}) is uniformly
convergent on [0, 0.5], but not on (0, 1].

6. Is the series of functions >>> , (sin 45 
vergent on R? But on [0,1]? But on fa, b]? 

7. Prove that the following series of functions are absolutely and 


uniformly convergent on the indicated domain: a) )>*~, ar! xceR; 


co) =6(-1)"37"" co sinna 
b) op, SS, 2 € (0,00); 0) Oe, He ER d) ON, ate we 
R;e) oe Feat TER. 
8. Can we differentiate term by term the following series? 
a) Tope €=p(—na) sin na, «© € [1,00); b) Tipe Sah € 
Ce reER. 


n=1 ae ’ 


aa Sin £) uniformly con- 


A 


9. Find the image of the following functions: 

a) f(x ee x € [—3, 12); 

b) f(z) =e +2£—-5,2€ R; 

c) f(x) = 2? — 3x +2, x € [—120, 120]; 

d) f(«) = 3sin 4x, x € [—-$, 5]; 

e) f(x) = |sinx — cos2z|, x € [0, a]; 

f) f(x) = |? + 22 — 1| — 3, x € (—0w, 9]. 

ae Find the norm of the following functions: a) f(«) = 2x — 5, 
€ [- = 1p 


AST) f (ie) =" B Cosiag ae ‘|i, 60) 6) Ff (@) z 
€ [-2,2]; d) f —g , where f(x) = 32 and g(x) = 42”, =; vAR 


CHAPTER 4 


Taylor series 


1. Taylor formula 


Polynomial functions have always been considered the most elementary
functions. A polynomial function of degree n is a function
defined on the whole real line by the formula:

P_n(x) = a_0 + a_1 x + a_2 x² + ... + a_n x^n,

where a_0, a_1, ..., a_n are fixed real numbers and a_n ≠ 0.

Many mathematicians tried and are trying to reduce the study of 
more complicated functions to polynomials. 

It is clear enough that not all functions can be represented by a
polynomial. For instance, the exponential function f(x) = exp(x) = e^x
cannot be represented by a polynomial P_n(x). Indeed, if

exp(x) = a_0 + a_1 x + a_2 x² + ... + a_n x^n

for x ∈ (a, b), a ≠ b, we differentiate n times and find: exp(x) = n! a_n,
a constant, which is not possible, because the exponential function is
strictly increasing. Here we proved in fact that the exponential function
cannot be represented by a polynomial in any small neighborhood of
any point on the real line. The following problem appears in many
applications. If x is very close to a fixed number a, i.e. if the difference
x − a is very small (is very close to zero!), can we represent a function
f as an "infinite" polynomial in the variable x − a? This means

(1.1) f(x) = a_0 + a_1(x − a) + a_2(x − a)² + ...

in a neighborhood (a − ε, a + ε) of a. This would imply that our function
is a function of class C^∞, i.e. it has derivatives of any order. But this
is not true for all functions. So, what we can hope is to "approximate"
a function f in a small neighborhood of a point a with a polynomial of
a given degree n in the variable x − a:


(1.2) f(x) = a9 + a4 (a — a) +. a9 (a — 2)? +... 4 an (x — a)" + Ry(2), 


where R,,(x) is a remainder which is a function of x (it also depends 
on f and on a!). This remainder is the error committed when we 
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approximate f(x) by the polynomial 
ag hay (ea) tas @ =a)? Puc haxg(e a)”. 


This polynomial is called the Taylor polynomial of order n at a. 
If f(x) is a polynomial of degree n, we can represent f as in formula 
(1.2) with the remainder zero. Indeed, the set of n + 1 polynomials


$\{1,\ x - a,\ (x - a)^2,\ (x - a)^3,\ \dots,\ (x - a)^n\}$


is linearly independent in the vector space $P_n$ of all polynomials of degree
at most n, which has dimension n + 1 over the real field (this comes
directly from the definition of a polynomial; why?). Hence,


$\{1,\ x - a,\ (x - a)^2,\ (x - a)^3,\ \dots,\ (x - a)^n\}$


is a basis in $P_n$ and so we can always uniquely find the constants
$a_0, a_1, a_2, \dots, a_n$ such that


(1.3)  $f(x) = a_0 + a_1 (x - a) + a_2 (x - a)^2 + \dots + a_n (x - a)^n$.


In this last case we can compute the coefficients $a_0, a_1, \dots, a_n$ by
using the values of f and of its derivatives $f', f'', \dots, f^{(n)}$ at a. Indeed,
let us make x = a in the equality (1.3). We get $f(a) = a_0$. If one
differentiates the same equality and makes x = a, one obtains $f'(a) = a_1$.
Now, if we differentiate this equality (1.3) twice, we get $f''(a) = 2a_2$,
and so on. Take the k-th derivative of both sides of (1.3) and find
$f^{(k)}(a) = k!\,a_k$ for any k = 1, 2, ..., n. Thus (1.3) becomes:


(1.4)  $f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \dots + \frac{f^{(n)}(a)}{n!}(x - a)^n$.


Generally, if the function f is not a polynomial of degree n, we
can formally write (it is clear that f must be n-times differentiable):


(1.5)  $f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \dots + \frac{f^{(n)}(a)}{n!}(x - a)^n + R_n(x)$,

where

$R_n(x) = f(x) - f(a) - \frac{f'(a)}{1!}(x - a) - \frac{f''(a)}{2!}(x - a)^2 - \dots - \frac{f^{(n)}(a)}{n!}(x - a)^n$.


The problem is to estimate this remainder. The famous Taylor formula 
gives a general estimation for this remainder. 
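
The coefficient rule $a_k = f^{(k)}(a)/k!$ obtained above for polynomials can be tried out numerically. The following Python lines are only a minimal sketch: the function names (`derivative`, `evaluate`, `taylor_coefficients`) and the representation of a polynomial as a list of coefficients in increasing powers of x are assumptions made for this illustration, not notation from the text.

```python
from math import factorial

def derivative(coeffs):
    """Coefficients of the derivative of sum(coeffs[k] * x**k)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Horner evaluation of the polynomial at x (empty list means 0)."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def taylor_coefficients(coeffs, a):
    """Return [f(a)/0!, f'(a)/1!, ..., f^(n)(a)/n!] for the polynomial f."""
    out = []
    current = list(coeffs)
    for k in range(len(coeffs)):
        out.append(evaluate(current, a) / factorial(k))
        current = derivative(current)
    return out

# f(x) = x^3 - 2x + 5 rewritten in powers of (x - 2):
# f(x) = 9 + 10 (x - 2) + 6 (x - 2)^2 + (x - 2)^3
print(taylor_coefficients([5, -2, 0, 1], 2))
```

Since f is a polynomial of degree 3, the four coefficients reproduce f exactly, i.e. the remainder in (1.5) is zero for n = 3.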


THEOREM 44. (Taylor formula) Let A be an open subset of R and
let $f : A \to R$ be a function defined on A with values in R, which
is (n + 1)-times differentiable on A. Let us fix a point a in A and a
natural number $p \neq 0$. Then, for any $x \in A$ such that the segment [a, x]
is included in A, there is a point $c \in (a, x)$ with the following property:
the remainder $R_n(x)$ from (1.5) has a representation of the form


(1.6)  $R_n(x) = \left(\frac{x - a}{x - c}\right)^p \frac{(x - c)^{n+1}}{n!\,p}\, f^{(n+1)}(c)$.


This general form of the remainder was discovered by Schlömilch. If
p = n + 1, we find the Lagrange form of the remainder


(1.7)  $R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x - a)^{n+1}$.


We see that this form is very similar to the general term form in (1.5).
In fact, it is "the next" term after the n-th term $\frac{f^{(n)}(a)}{n!}(x - a)^n$, in which
the value of $f^{(n+1)}$ is not computed at a, but at a close point $c \in [a, x]$
(here we do not mean that a is less than x!). Usually, the error made
by approximating f(x) with its Taylor polynomial $T_n(x)$ of order n,


(1.8)  $T_n(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \dots + \frac{f^{(n)}(a)}{n!}(x - a)^n$,


is evaluated by the Lagrange form of the remainder $R_n(x)$. Since we
have no supplementary information on the number c, we use the
following upper bound:


(1.9)  $|R_n(x)| \le \frac{|x - a|^{n+1}}{(n+1)!} \sup\{|f^{(n+1)}(z)| : z \in [a, x]\}$.

Since we frequently use the Taylor formula with Lagrange remainder, we
write it here in a complete form (together with this last form of the
remainder):

(1.10)  $f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \dots + \frac{f^{(n)}(a)}{n!}(x - a)^n + \frac{f^{(n+1)}(c)}{(n+1)!}(x - a)^{n+1}$.


PROOF. The proof of this theorem is not so natural. Let us assume
that x > a. In this case, the segment [a, x] is exactly the closed interval
[a, x]. Let us denote in (1.5)


(1.11)  $Q(x) = \frac{R_n(x)}{(x - a)^p}$.

Thus, the formula (1.5) becomes:


(1.12)  $f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \dots + \frac{f^{(n)}(a)}{n!}(x - a)^n + (x - a)^p Q(x)$.

In order to obtain a representation for Q(x), we consider an auxiliary
function:


(1.13)  $g(t) = f(t) + \frac{f'(t)}{1!}(x - t) + \frac{f''(t)}{2!}(x - t)^2 + \dots + \frac{f^{(n)}(t)}{n!}(x - t)^n + (x - t)^p Q(x)$.

We obtained the expression of g(t) by simply putting t instead of a
in (1.12). We now apply Rolle's Theorem (Theorem 36) on the
interval [a, x]. The function g(t) is continuous and differentiable on
[a, x], g(a) = f(x) (see (1.12)) and g(x) = f(x), so g(a) = g(x). Thus,
there is a point $c \in (a, x)$ such that $g'(c) = 0$. Let us compute $g'(t)$:


$g'(t) = f'(t) + \left[\frac{f''(t)}{1!}(x - t) - \frac{f'(t)}{1!}\right] + \left[\frac{f'''(t)}{2!}(x - t)^2 - \frac{f''(t)}{1!}(x - t)\right] + \dots$

$\qquad + \left[\frac{f^{(n+1)}(t)}{n!}(x - t)^n - \frac{f^{(n)}(t)}{(n-1)!}(x - t)^{n-1}\right] - p(x - t)^{p-1} Q(x)$.

So we get


(1.14)  $g'(t) = \frac{f^{(n+1)}(t)}{n!}(x - t)^n - p(x - t)^{p-1} Q(x)$.


Make now t = c in (1.14) and find

$0 = g'(c) = \frac{f^{(n+1)}(c)}{n!}(x - c)^n - p(x - c)^{p-1} Q(x)$.

If here, instead of Q(x), we put $\frac{R_n(x)}{(x - a)^p}$ (see (1.11)), we get


$R_n(x) = \frac{(x - a)^p}{(x - c)^{p-1}} \cdot \frac{f^{(n+1)}(c)}{n!\,p}\,(x - c)^n = \left(\frac{x - a}{x - c}\right)^p \frac{(x - c)^{n+1}}{n!\,p}\, f^{(n+1)}(c)$,


i.e. formula (1.6). The other statements of the theorem are easily
deduced from this last formula.


REMARK 14. A function f(x) is a zero of another function g(x)
at a point a if $\lim_{x \to a} \frac{f(x)}{g(x)} = 0$. We write this as $f(x) = o(g(x))$ at a.
For instance, from (1.7) we see that the remainder $R_n(x)$ is a zero of
$(x - a)^n$ at a, i.e. $R_n(x) = o((x - a)^n)$ at x = a.


If a = 0, the formula (1.5) is called the Mac Laurin formula:


(1.15)  $f(x) = f(0) + \frac{f'(0)}{1!}x + \frac{f''(0)}{2!}x^2 + \dots + \frac{f^{(n)}(0)}{n!}x^n + R_n(x)$.

If we use the Lagrange form of the remainder (1.7), we get


(1.16)  $f(x) = f(0) + \frac{f'(0)}{1!}x + \frac{f''(0)}{2!}x^2 + \dots + \frac{f^{(n)}(0)}{n!}x^n + \frac{f^{(n+1)}(c)}{(n+1)!}x^{n+1}$,

where c is a real number between 0 and x. Since it is easier to
manipulate Mac Laurin formulas for many functions which are defined on
an interval (a, b) with $0 \in (a, b)$ and since the translation $x \to x - a$
connects Taylor formulas with Mac Laurin formulas, we prefer to deduce
these last formulas for the basic elementary functions.


EXAMPLE 2. (exp(x)) Let $f(x) = \exp(x) = e^x$, $x \in R$. Since the
derivative of exp(x) is exp(x) itself, the Taylor formula at a = 0 (Mac
Laurin formula) for exp(x) becomes


(1.17)  $\exp(x) = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \dots + \frac{x^n}{n!} + \exp(c)\frac{x^{n+1}}{(n+1)!}$,


where $c \in (0, x)$ if x > 0, or $c \in (x, 0)$ if x < 0.


For instance, let us compute exp(0.03) with 2 exact decimals. Since
$c \in (0, 0.03)$, we have $\exp(c) < 3$, and it is enough to require that


$|R_n(0.03)| = \exp(c)\frac{(0.03)^{n+1}}{(n+1)!} < 3\,\frac{(0.03)^{n+1}}{(n+1)!} \le \frac{1}{100}$,

or

$\frac{3^{n+2}}{100^{n+1}(n+1)!} \le \frac{1}{100}$, i.e. $3^{n+2} \le 100^n (n+1)!$.


It is easy to prove this last inequality by mathematical induction for
$n \ge 1$. So, $\exp(0.03) \approx 1 + \frac{0.03}{1!} = 1.03$, with 2 exact decimals. Polynomial
approximation with a controlled remainder, as here, is the basic idea behind
the (approximate) computation of exp(r) for a given real number r on a
computer. Formula (1.17) can also be written as


(1.18)  $\exp(x) = 1 + \frac{x}{1!} + \dots + \frac{x^n}{n!} + o(x^n)$.
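
The search for a suitable order n in the computation of exp(0.03) can be sketched as follows. This is a minimal illustration, not a production routine: the names `maclaurin_exp`, `lagrange_bound_exp` and `smallest_order` are hypothetical, and the constant 3 used as an upper bound for exp(c) is an assumption that is valid only for 0 <= x <= 1.

```python
from math import exp, factorial

def maclaurin_exp(x, n):
    """Mac Laurin polynomial of exp of order n, evaluated at x."""
    return sum(x**k / factorial(k) for k in range(n + 1))

def lagrange_bound_exp(x, n, exp_upper=3.0):
    """Bound exp_upper * |x|^(n+1) / (n+1)! for |R_n(x)|.

    exp_upper must dominate exp(c) for every c between 0 and x;
    the default 3.0 is an assumption valid for 0 <= x <= 1.
    """
    return exp_upper * abs(x) ** (n + 1) / factorial(n + 1)

def smallest_order(x, tol, exp_upper=3.0):
    """Smallest n whose Lagrange bound is below tol."""
    n = 0
    while lagrange_bound_exp(x, n, exp_upper) >= tol:
        n += 1
    return n

n = smallest_order(0.03, 0.01)
print(n, maclaurin_exp(0.03, n), exp(0.03))
```

For x = 0.03 and tolerance 1/100 the bound fails for n = 0 (it equals 0.09) and holds for n = 1 (it equals 0.00135), in agreement with the approximation 1.03 found above.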
We can use this formula to compute limits of indeterminate form. For
instance, let us compute


$\lim_{x \to 0} \frac{\exp(x^3) - 1 - x^3 - \frac{x^6}{2}}{\exp(x^2) - 1 - x^2 - \frac{x^4}{2}}$.

In formula (1.18) we put $x^3$ instead of x and n = 2:


$\exp(x^3) = 1 + x^3 + \frac{x^6}{2} + o(x^6)$.

If we now put $x^2$ instead of x in (1.18) and n = 3, we get

$\exp(x^2) = 1 + x^2 + \frac{x^4}{2} + \frac{x^6}{6} + o(x^6)$.

Hence, our limit becomes

$\lim_{x \to 0} \frac{o(x^6)}{\frac{x^6}{6} + o(x^6)} = \lim_{x \to 0} \frac{\frac{o(x^6)}{x^6}}{\frac{1}{6} + \frac{o(x^6)}{x^6}} = \frac{0}{\frac{1}{6} + 0} = 0$.


In practice, we do not know in advance how many terms we must
consider in the numerator and in the denominator in order to eliminate the
indeterminacy. So, it is a good idea to consider one or two terms
more than the degree of the polynomial part which produces the
indeterminacy. In our example we write


$\lim_{x \to 0} \frac{\frac{x^9}{3!} + \frac{x^{12}}{4!} + \dots}{\frac{x^6}{3!} + \frac{x^8}{4!} + \dots} = \lim_{x \to 0} x^3 \cdot \frac{\frac{1}{3!} + \frac{x^3}{4!} + \dots}{\frac{1}{3!} + \frac{x^2}{4!} + \dots} = 0 \cdot \frac{1/3!}{1/3!} = 0$.
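
As a rough numerical cross-check of this limit, one can evaluate the quotient of the two series tails with exact rational arithmetic, which avoids the cancellation that floating point would suffer near 0. The sketch below is only an illustration under stated assumptions: `exp_tail` and `ratio` are hypothetical helper names, and each tail is truncated to 12 terms.

```python
from fractions import Fraction
from math import factorial

def exp_tail(u, first, terms=12):
    """Truncated tail sum_{k=first}^{first+terms-1} u^k / k! of the exp series."""
    return sum(u**k / factorial(k) for k in range(first, first + terms))

def ratio(x):
    """(exp(x^3) - 1 - x^3 - x^6/2) / (exp(x^2) - 1 - x^2 - x^4/2), via truncated tails."""
    x = Fraction(x)
    return exp_tail(x**3, 3) / exp_tail(x**2, 3)

for j in (1, 2, 3):
    x = Fraction(1, 10**j)
    print(float(x), float(ratio(x)))
```

The printed quotients shrink roughly like $x^3$, which is consistent with the value 0 of the limit.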
EXAMPLE 3. (sin(x)) Let $f(x) = \sin(x)$, $x \in R$. Since $[\sin(x)]' = \cos(x)$,
$[\sin(x)]'' = -\sin(x)$, $[\sin(x)]''' = -\cos(x)$ and $[\sin(x)]^{(4)} = \sin(x)$,
we obtain that $[\sin(x)]^{(4k+1)} = \cos(x)$, $[\sin(x)]^{(4k+2)} = -\sin(x)$,
$[\sin(x)]^{(4k+3)} = -\cos(x)$ and $[\sin(x)]^{(4k)} = \sin(x)$ for any k = 0, 1, ... .
Now, sin 0 = 0, cos 0 = 1 and, applying formula (1.16), we get

(1.19)  $\sin(x) = \frac{x}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots + (-1)^n \frac{x^{2n+1}}{(2n+1)!} + o(x^{2n+1})$.


It is more complicated to express the remainder in this case because the
(n + 1)-th derivative of sin(x) is either $\pm\sin(x)$ or $\pm\cos(x)$. Let us use
the Mac Laurin formula for sin(x) in order to compute sin(0.2) with
one exact decimal. Here 0.2 means 0.2 radians. Now, the modulus of
the remainder, $|R_{2n+1}(x)|$, is less than or equal to $\frac{1}{(2n+2)!}|x|^{2n+2}$. So,


$|R_{2n+1}(0.2)| \le \frac{1}{(2n+2)!}(0.2)^{2n+2}$


and this last one must be less than $\frac{1}{10}$, i.e.


$\frac{2^{2n+2}}{(2n+2)!\,10^{2n+2}} < \frac{1}{10}$,


or

$2^{2n+2} < (2n+2)!\,10^{2n+1}$.


But this last one is true for any $n \ge 0$. Hence, $\sin(0.2) \approx 0.2$ with one
exact decimal.
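
The estimate for sin(0.2) can be checked with a few lines of Python. This is a minimal sketch; the names `maclaurin_sin` and `sin_remainder_bound` are assumptions introduced only for this illustration.

```python
from math import factorial, sin

def maclaurin_sin(x, n):
    """Partial sum of (1.19) up to the term x^(2n+1)/(2n+1)!."""
    return sum((-1) ** k * x ** (2 * k + 1) / factorial(2 * k + 1)
               for k in range(n + 1))

def sin_remainder_bound(x, n):
    """Upper bound |x|^(2n+2)/(2n+2)! for |R_{2n+1}(x)|."""
    return abs(x) ** (2 * n + 2) / factorial(2 * n + 2)

x = 0.2
print(maclaurin_sin(x, 0), sin(x), sin_remainder_bound(x, 0))
```

Already for n = 0 the bound equals 0.02, which is below 1/10, and the true error |sin(0.2) - 0.2| is smaller still.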


EXAMPLE 4. (cos(x)) Let $f(x) = \cos(x)$, $x \in R$. As in Example
3, we easily deduce the following formula


(1.20)  $\cos(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots + (-1)^n \frac{x^{2n}}{(2n)!} + o(x^{2n})$.


EXAMPLE 5. Let

$f(x) = \ln(1 + x)$, $x \in (-1, \infty)$.

Since


$f'(x) = (1 + x)^{-1}$, $f''(x) = -(1 + x)^{-2}$, $f'''(x) = 2(1 + x)^{-3}$, ...,

$f^{(n)}(x) = (-1)^{n-1}(n - 1)!\,(1 + x)^{-n}$, ...,

one has that $f(0) = 0$, $f'(0) = 1$, $f''(0) = -1$, $f'''(0) = 2$, ...,
$f^{(n)}(0) = (-1)^{n-1}(n - 1)!$, ... . So, the formula (1.16) becomes

(1.21)  $\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \dots + (-1)^{n-1}\frac{x^n}{n} + (-1)^n \frac{(1 + c)^{-n-1}}{n+1} x^{n+1}$,

where c is a real number between 0 and x. Hence,

(1.22)  $\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots + (-1)^{n-1}\frac{x^n}{n} + o(x^n)$.


Let us compute ln(1.02) with 3 exact decimals. Since


$\ln(1.02) = \ln(1 + 0.02) = 0.02 - \frac{(0.02)^2}{2} + \frac{(0.02)^3}{3} - \dots + (-1)^{n-1}\frac{(0.02)^n}{n} + (-1)^n \frac{(1 + c)^{-n-1}}{n+1}(0.02)^{n+1}$,

where c is between 0 and 0.02, we must evaluate the modulus of the
remainder and force its upper bound to be less than $\frac{1}{1000}$:

$\left|(-1)^n \frac{(1 + c)^{-n-1}}{n+1}(0.02)^{n+1}\right| < \frac{(0.02)^{n+1}}{n+1} = \frac{2^{n+1}}{(n+1)\,100^{n+1}} < \frac{1}{1000}$.


This last inequality is true for any $n \ge 1$. Thus, $\ln(1.02) \approx 0.020$ with
3 exact decimals. Pay attention! It is not sure that 020 are the first
three decimals of ln(1.02)! What is sure is that $|\ln(1.02) - 0.02|$ is less
than $0.001 = \frac{1}{1000}$ (this means "with 3 exact decimals!").
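
The warning about "exact decimals" in the computation of ln(1.02) can be seen numerically. The sketch below uses hypothetical helper names (`maclaurin_log1p`, `log1p_remainder_bound`); the bound $x^{n+1}/(n+1)$ is used only for x >= 0, where 1 + c >= 1.

```python
from math import log

def maclaurin_log1p(x, n):
    """Partial sum x - x^2/2 + ... + (-1)^(n-1) x^n / n of (1.22)."""
    return sum((-1) ** (k - 1) * x ** k / k for k in range(1, n + 1))

def log1p_remainder_bound(x, n):
    """Bound x^(n+1)/(n+1) for the Lagrange remainder; assumes x >= 0."""
    return x ** (n + 1) / (n + 1)

approx = maclaurin_log1p(0.02, 1)      # the one-term approximation 0.02
exact = log(1.02)
print(approx, exact, abs(exact - approx), log1p_remainder_bound(0.02, 1))
```

The error is below 0.001, as the estimate guarantees, although the decimal expansion of ln(1.02) begins with 0.019 and not with 0.020.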

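EXAMPLE 6. (Binomial formula) Let $f(x) = (1 + x)^\alpha$, where $\alpha$ is
a fixed real number and x > -1. Since

$f'(x) = \alpha(1 + x)^{\alpha - 1}$, $f''(x) = \alpha(\alpha - 1)(1 + x)^{\alpha - 2}$, ...,
$f^{(n)}(x) = \alpha(\alpha - 1)(\alpha - 2)\dots(\alpha - n + 1)(1 + x)^{\alpha - n}$, ...,

one has that

$f(0) = 1$, $f'(0) = \alpha$, $f''(0) = \alpha(\alpha - 1)$, ...,
$f^{(n)}(0) = \alpha(\alpha - 1)(\alpha - 2)\dots(\alpha - n + 1)$, ... .

Now, formula (1.16) becomes


(1.23)  $(1 + x)^\alpha = 1 + \frac{\alpha}{1!}x + \frac{\alpha(\alpha - 1)}{2!}x^2 + \dots + \frac{\alpha(\alpha - 1)\dots(\alpha - n + 1)}{n!}x^n + \frac{\alpha(\alpha - 1)\dots(\alpha - n)(1 + c)^{\alpha - n - 1}}{(n+1)!}x^{n+1}$,

where c is a real number between 0 and x.
Formula (1.23) can also be written as

$(1 + x)^\alpha = 1 + \frac{\alpha}{1!}x + \frac{\alpha(\alpha - 1)}{2!}x^2 + \dots + \frac{\alpha(\alpha - 1)(\alpha - 2)\dots(\alpha - n + 1)}{n!}x^n + o(x^n)$.


Let us use this formula to approximate the following expression
$E = E(q) = \frac{1}{\sqrt{a + bq^2}}$, a, b > 0, by a polynomial of degree 2 in $q^2$ (it is used
in Physics for q small). In order to apply (1.23) we need to put our
expression in the form $(1 + x)^\alpha$. So,


$E = (a + bq^2)^{-1/2} = a^{-1/2}\left(1 + \frac{b}{a}q^2\right)^{-1/2}$.

Let us take only $\left(1 + \frac{b}{a}q^2\right)^{-1/2}$ and use (1.23) up to $x^2$, where $x = \frac{b}{a}q^2$
and $\alpha = -\frac{1}{2}$. We get

$\left(1 + \frac{b}{a}q^2\right)^{-1/2} \approx 1 - \frac{b}{2a}q^2 + \frac{3b^2}{8a^2}q^4$.

Hence,

$\frac{1}{\sqrt{a + bq^2}} \approx \frac{1}{\sqrt{a}} - \frac{b}{2a\sqrt{a}}q^2 + \frac{3b^2}{8a^2\sqrt{a}}q^4$.

If $\alpha = n$, a natural number, we obtain the famous binomial formula
of Newton:


(1.25)  $(1 + x)^n = 1 + \frac{n}{1!}x + \frac{n(n - 1)}{2!}x^2 + \dots + \frac{n(n - 1)(n - 2)\dots 1}{n!}x^n$,

because the remainder in (1.23) is zero. If instead of x we put $\frac{b}{a}$ in
(1.25) we get


$\left(1 + \frac{b}{a}\right)^n = 1 + \binom{n}{1}\frac{b}{a} + \binom{n}{2}\frac{b^2}{a^2} + \dots + \binom{n}{n}\frac{b^n}{a^n}$.


Multiplying by $a^n$, we get:

(1.26)  $(a + b)^n = a^n + \binom{n}{1}a^{n-1}b + \binom{n}{2}a^{n-2}b^2 + \binom{n}{3}a^{n-3}b^3 + \dots + \binom{n}{n}b^n$.


Here, $\binom{n}{k} = \frac{n!}{k!(n - k)!} = \frac{n(n - 1)\dots(n - k + 1)}{k!}$ is the number of combinations of n objects taken k at a time.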

EXAMPLE 7. The equilibrium position of a homogeneous weighted
string, fixed at the ends, has a form given by the plane curve
$y = a \cdot \mathrm{ch}\left(\frac{x}{b}\right)$, where $\mathrm{ch}(x) = \frac{\exp(x) + \exp(-x)}{2}$ and a, b are real numbers. The
function f(x) = ch(x) is called the hyperbolic cosine of x.

The derivative of the function ch(x) is $\mathrm{sh}(x) = \frac{\exp(x) - \exp(-x)}{2}$, called
the hyperbolic sine of x. Since the derivative of each of them is the other
one, we easily get the formulas


(1.27)  $\mathrm{sh}(x) = \frac{x}{1!} + \frac{x^3}{3!} + \frac{x^5}{5!} + \dots + \frac{x^{2n+1}}{(2n+1)!} + o(x^{2n+1})$,

(1.28)  $\mathrm{ch}(x) = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \frac{x^6}{6!} + \dots + \frac{x^{2n}}{(2n)!} + o(x^{2n})$.


For instance, for x small enough, we can approximate ch(x) by the
polynomial $T_4(x) = 1 + \frac{x^2}{2} + \frac{x^4}{24}$. For x = 0.5, $\mathrm{ch}(0.5) \approx 1 + \frac{0.25}{2} + \frac{0.0625}{24}$.


Taylor's and Mac Laurin's formulas have many applications in the
local study of a function (or a curve).

COROLLARY 5. (Lagrange formula) Let us write Taylor formula
(1.10) for n = 0: $f(x) = f(a) + f'(c) \cdot (x - a)$, where c is a number
between a and x. If x = b > a, we get the classical Lagrange formula:


$f(b) = f(a) + f'(c) \cdot (b - a)$, where $c \in (a, b)$.


REMARK 15. We can use Taylor formula (1.10) to study the shape
of a function in a neighborhood of a point a. Suppose that


$f'(a) = f''(a) = \dots = f^{(n-1)}(a) = 0$

and $f^{(n)}(a) \neq 0$. We also assume that f is of class $C^n$ on an
$\varepsilon$-neighborhood $(a - \varepsilon, a + \varepsilon)$ of a. Then


(1.29)  $f(x) - f(a) = \frac{f^{(n)}(c)}{n!}(x - a)^n$,


where c is between a and x. It is clear that the continuity of $f^{(n)}(x)$ at a
implies that the sign of this last function is constant on a (maybe smaller)
subinterval $(a - \delta, a + \delta)$ of $(a - \varepsilon, a + \varepsilon)$ and that it is the same as the
sign of $f^{(n)}(a)$ (see Theorem 34). Suppose that $f^{(n)}(x) > 0$ for any
$x \in (a - \delta, a + \delta)$. Then, in (1.29), $c \in (a - \delta, a + \delta)$ and so the sign of
the difference f(x) - f(a) depends exclusively on n and on the sign of
$f^{(n)}(a)$. If n is even and $f^{(n)}(a) > 0$, the difference f(x) - f(a) is $\ge 0$
for any $x \in (a - \delta, a + \delta)$, thus a is a local minimum point for f. If
n is even but $f^{(n)}(a) < 0$, then the difference f(x) - f(a) is $\le 0$ for
any $x \in (a - \delta, a + \delta)$, so a is a local maximum point for f. If n is
odd, the point a is not an extremum point because the sign of $(x - a)^n$
changes (it is positive if x > a and negative otherwise). For instance,
$f(x) = (x - 2)^3$ has no extremum at x = 2.


Let A be an open subset of R and let $f : A \to R$ be a function
of class $C^1$ on A. This means that f is differentiable on A and its
derivative f' is continuous on A. One also says that f is smooth on A.
We say that f is convex at the point a of A if the graph of f is above
the tangent line to this graph at a, on a small open $\varepsilon$-neighborhood
U of a which is contained in A. If here we substitute the word "above"
with the word "under", we get the definition of a concave function f
at a point a. Since the equation of the tangent line to the graph of
the function f at a is:


$Y = f(a) + f'(a)(X - a)$,

f is a convex function at a if and only if

(1.30)  $f(x) \ge f(a) + f'(a)(x - a)$,

for any x in $U = (a - \varepsilon, a + \varepsilon) \subset A$.

COROLLARY 6. Let the above f be a function of class $C^2$ on
$U = (a - \varepsilon, a + \varepsilon)$. We assume that $f''(a) \neq 0$. Then f is convex at a if and
only if $f''(a) > 0$.


PROOF. Let x be a point in U and let us write the Taylor formula
(1.10) for n = 1 at a on the segment [a, x]:


(1.31)  $f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(c_x)}{2!}(x - a)^2$,


where $c_x \in [a, x]$. If f is convex at a, then there is a small interval
$U' = (a - \varepsilon', a + \varepsilon') \subset U$ such that (1.30) works on U'. Hence, for any
$x \neq a$ in U' one has that $f''(c_x) \ge 0$ in (1.31). Since f'' is continuous on U
(recall that f is of class $C^2$ on U!) and since $c_x \to a$ whenever
$x \to a$, one has that $f''(a) \ge 0$. But we assumed that $f''(a) \neq 0$,
so $f''(a) > 0$. Conversely, if $f''(a) > 0$, then $f''(x) > 0$ on a whole
neighborhood $U'' = (a - \varepsilon'', a + \varepsilon'') \subset U$. Thus $f''(c_x) > 0$ in (1.31)
for any x in U''. So, (1.30) works on this U''. Therefore f is convex at
a.


We leave the reader to state and to prove a similar result for a 
concave function f at a. 


2. Taylor series 


Let us consider a function f of class $C^\infty$ on an open subset A of
R. This means that f has derivatives of any arbitrary order on A. It
is clear that all of these derivatives are continuous on A. Look at the
formula (1.10) and let n tend to $\infty$. We obtain the series of
functions on the right side:


(2.1)  $f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \dots + \frac{f^{(n)}(a)}{n!}(x - a)^n + \dots$


This series of functions is called the Taylor series associated to
the function f at the point a. If this series of functions is uniformly
convergent and its sum is f(x), we say that


(2.2)  $f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n$


is the Taylor expansion of f around the point a. If the series on the
right side is simply (pointwise) convergent and its sum is f on an
$\varepsilon$-neighborhood of a, we say that f is analytic at a. If f is analytic at any point of
A we say that f is analytic on A. The series on the right in (2.2) is
a particular case of a more general type of series of functions, namely,
the power series. A power series is a series of functions of the form
$\sum_{n=0}^{\infty} a_n (x - a)^n$, where $\{a_n\}$ is a sequence of real numbers and a is a
fixed arbitrary number.


THEOREM 45. Let $f : (c, d) \to R$ be an indefinitely differentiable
function on an interval (c, d) ($f \in C^\infty(c, d)$) such that there is a positive
real number M which verifies $|f^{(n)}(x)| \le M$ for any $x \in (c, d)$ and for
any n = 0, 1, ... (we say that all the derivatives of f are uniformly
bounded on (c, d)). Then the series $\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n$ is absolutely
and uniformly convergent on (c, d) for any fixed a in (c, d). Moreover,

$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n$

for any fixed a in (c, d). The series on the right is absolutely uniformly
convergent to f.


PROOF. Let us denote L = d - c, the length of the interval (c, d).
We apply the Weierstrass Test (Theorem 41):


$\left|\frac{f^{(n)}(a)}{n!}(x - a)^n\right| \le \frac{M}{n!}L^n$ for any $x \in (c, d)$,

and the numerical series $\sum_{n=0}^{\infty} \frac{M}{n!}L^n$ is convergent (use the Ratio Test:
$\frac{a_{n+1}}{a_n} = \frac{L}{n+1} \to 0 < 1$). Hence, the series $\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n$ is
absolutely and uniformly convergent. Let


$s_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x - a)^k$.


Formula (1.10) gives us, for a point $\xi$ between a and x:


$|f(x) - s_n(x)| = \left|\frac{f^{(n+1)}(\xi)}{(n+1)!}(x - a)^{n+1}\right| \le \frac{M}{(n+1)!}L^{n+1}$.

Taking sup we obtain $\|f - s_n\| \le \frac{M}{(n+1)!}L^{n+1}$ and, since $\frac{M L^{n+1}}{(n+1)!} \to 0$
as $n \to \infty$ (prove it by using a numerical series!), we get that $\{s_n\}$ is
uniformly convergent to f. In particular, $f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n$ for any $x \in (c, d)$.

EXAMPLE 8. (Taylor series for the basic elementary functions)

a) We know that

$\exp(x) = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \dots + \frac{x^n}{n!} + \exp(c)\frac{x^{n+1}}{(n+1)!}$.

Since all the derivatives of exp(x) are uniformly bounded on any bounded
interval (a, b) (why?) we can apply Theorem 45 and find that the
series $\sum_{n=0}^{\infty} \frac{1}{n!}x^n$ is absolutely and uniformly convergent on any bounded
interval (a, b). In particular, we have the Taylor expansion


(2.3)  $\exp(x) = \sum_{n=0}^{\infty} \frac{1}{n!}x^n$, $x \in R$.

b) We leave the reader to deduce the following Taylor expansions:

(2.4)  $\sin(x) = \frac{x}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots + (-1)^n \frac{x^{2n+1}}{(2n+1)!} + \dots = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!}x^{2n+1}$, $x \in R$,

(2.5)  $\cos(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots + (-1)^n \frac{x^{2n}}{(2n)!} + \dots = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!}x^{2n}$, $x \in R$.


Since all the derivatives of sinx and cosx are uniformly (indepen- 
dent of x) bounded (by 1) on R, the series on the right side in the last 
two formulas are absolutely and uniformly convergent on any bounded 
interval of R (why not on the whole R?). 


c)

(2.6)  $\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots + (-1)^{n-1}\frac{x^n}{n} + \dots = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}x^n$, $x \in (-1, 1)$.


Since the n-th derivative of $f(x) = \ln(1 + x)$ is

$f^{(n)}(x) = (-1)^{n-1}(n - 1)!\,(1 + x)^{-n}$,

it is not uniformly bounded on the whole interval (-1, 1) (why?
because $\sup(1 + x)^{-n} = \infty$ there!). Even on any other small subinterval
[a, b] of (-1, 1) the derivatives of ln(1 + x) are not uniformly bounded
(because of n, this time!). Hence, we cannot apply the above Theorem
45. Let us look directly at the absolute value of the remainder in (1.21)
when $x \in (-1, 1)$:


$\left|(-1)^n \frac{(1 + c)^{-n-1}}{n+1}x^{n+1}\right| = \frac{1}{n+1} \cdot \frac{|x|^{n+1}}{(1 + c)^{n+1}}$,

where c belongs to the segment [0, x]*, i.e. $c \in [0, x]$, or [x, 0] (for
x < 0). It is clear that if $x \to -1$, c may become closer and closer to
-1 and the remainder cannot uniformly go to 0. But, if we take any
subinterval [a, b] of (-1, 1), then


$\sup_{x \in [a,b]} \left|(-1)^n \frac{(1 + c)^{-n-1}}{n+1}x^{n+1}\right| \le \frac{1}{n+1} \cdot \frac{M^{n+1}}{(1 + m)^{n+1}}$,


where $M = \max\{|a|, |b|\}$ and $m = \min\{|a|, |b|\}$. Thus, in this last
case,


$\|\ln(1 + x) - s_n\| = \sup_{x \in [a,b]} \left|(-1)^n \frac{(1 + c)^{-n-1}}{n+1}x^{n+1}\right| \le \frac{1}{n+1}\left[\frac{M}{1 + m}\right]^{n+1} \to 0$,

because $\frac{M}{1 + m} < 1$. So, $\{s_n(x)\}$ is uniformly convergent to ln(1 + x),
relative to x, on $[a, b] \subset (-1, 1)$.
d)

$(1 + x)^\alpha = 1 + \frac{\alpha}{1!}x + \frac{\alpha(\alpha - 1)}{2!}x^2 + \frac{\alpha(\alpha - 1)(\alpha - 2)}{3!}x^3 + \dots + \frac{\alpha(\alpha - 1)\dots(\alpha - n + 1)}{n!}x^n + \dots$

or

(2.7)  $(1 + x)^\alpha = 1 + \sum_{n=1}^{\infty} \frac{\alpha(\alpha - 1)(\alpha - 2)\dots(\alpha - n + 1)}{n!}x^n$, $x \in (-1, 1)$.


For the series on the right side we shall prove later (Ch. 5, Abel
Theorem, Theorem 46) that it is absolutely and uniformly convergent
on any closed subinterval [a, b] of (-1, 1). We leave the reader to try
a direct proof of this last statement. For a fixed x in (-1, 1) the
series in (2.7) is convergent (apply the Ratio Test). Thus, the series of
functions is simply (pointwise) convergent on (-1, 1).

3. Problems


1. Find the Mac Laurin expansion for the following functions.
Indicate the convergence (or uniform convergence) domain for each of
them.

a) $f(x) = \frac{1}{4}(\exp(x) + \exp(-x) + 2\cos x)$; Hint: Use formula (2.3)
for exp(x) and for exp(-x) (put -x instead of x!) and formula (2.5) for
cos(x).

b) $f(x) = \frac{1}{2}\arctan(x) + \frac{1}{4}\ln\frac{1 + x}{1 - x}$; Hint: Compute


$(\arctan(x))' = \frac{1}{1 + x^2} = 1 - x^2 + x^4 - \dots$

and then integrate term by term; write then


$\ln\frac{1 + x}{1 - x} = \ln(1 + x) - \ln(1 - x)$


and use formula (2.6) twice.

c) $f(x) = x \cdot \arctan(x) - \ln\sqrt{1 + x^2}$; Hint: Write


$\ln\sqrt{1 + x^2} = \frac{1}{2}\ln(1 + x^2) = \frac{1}{2}\left(x^2 - \frac{x^4}{2} + \frac{x^6}{3} - \dots\right)$.
d) f(z) = IS aeES Hint: Write > ree = — ae then, for 
instance 
1 _ 1 1 _ 1 1 XL ia ; gy 
a a 21-2 9 Ey gg PN Zag Ser 4 


e) f(e) = 9529 f) f(z) = n(2—-32+27); Hint: In(2—32+27) = 
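$\ln(1 - x) + \ln(2 - x)$ and


$\ln(2 - x) = \ln 2 + \ln\left(1 - \frac{x}{2}\right) = \ln 2 - \left(\frac{x}{2} + \frac{x^2}{2 \cdot 2^2} + \frac{x^3}{3 \cdot 2^3} + \dots\right)$.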


g) $f(x) = x\exp(-2x)$; Hint: in formula (2.3) put -2x instead of x;


h) $f(x) = \sin(3x) + x\cos(3x)$; i) $f(x) = \arcsin x$; Hint: Compute
$(\arcsin x)' = (1 - x^2)^{-1/2}$ and use the formula (2.7) with $-x^2$ instead of x and
$\alpha = -\frac{1}{2}$;

j) $f(x) = \sin^3 x$; Hint: Write $\sin^3 x = \frac{3}{4}\sin x - \frac{1}{4}\sin 3x$ and use
formula (2.4) twice.

2. Write as a series of the form $\sum_{n=0}^{\infty} a_n (x + 3)^n$ the following
functions (say where this representation is possible):

a) $f(x) = \sin(3x + 2)$; Hint: Denote x + 3 = z (a new variable) and
write f(x) as a new function of z:


$g(z) = \sin(3(z - 3) + 2) = \sin(3z - 7) = [\sin 3z]\cos 7 - [\cos 3z]\sin 7 =$

$= [\cos 7]\left(\frac{3z}{1!} - \frac{3^3 z^3}{3!} + \dots\right) - [\sin 7]\left(1 - \frac{3^2 z^2}{2!} + \dots\right)$;


now, come back to f(x) by the substitution z = x + 3, etc.
b) f(z) = V(3 + 2x); c) f(z) = In(5 — 4x); d) f(a) = exp(2x + 5); 


e) f(c) = aes; £) fe) = =e. 
3. Using Mac Laurin formulas, compute the following limits: 


4) lim SP nee) ib) ae eames c)lim Y148a-a-1 , 
20 x3 Uae San | x 1 7.9 1-4e—exp(—42)? 
2 
.  cosx—exp(— 5 
d) lim SPC). 
x—0 2 
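e) $\lim_{x \to \infty}\left[x - x^2\ln\left(1 + \frac{1}{x}\right)\right]$; Hint: Write $y = \frac{1}{x}$; now, $x \to \infty$ if and
only if y > 0 and $y \to 0$; our limit becomes


$\lim_{y \to 0,\, y > 0}\left[\frac{1}{y} - \frac{1}{y^2}\ln(1 + y)\right] = \lim_{y \to 0,\, y > 0}\left[\frac{1}{y} - \frac{1}{y^2}\left(y - \frac{y^2}{2} + \frac{y^3}{3} - \dots\right)\right] = \lim_{y \to 0,\, y > 0}\left[\frac{1}{2} - \frac{y}{3} + \dots\right] = \frac{1}{2}$.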

4. Using Taylor formula approximately compute: a) $\sqrt{1.07}$ with 2
exact decimal digits; b) exp(0.25) with 3 exact decimals; c) ln(1.2) with
3 exact decimals; d) $\sin 1°$ with 5 exact decimals; Hint: $1° = \frac{\pi}{180}$ radians;
so,

$\sin\frac{\pi}{180} \approx \frac{x}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots + (-1)^n \frac{x^{2n+1}}{(2n+1)!}$,

where $x = \frac{\pi}{180}$, and n is chosen such that $|R_{2n+1}(x)|$, which is less than
$\frac{1}{(2n+2)!}x^{2n+2}$, is less than $\frac{1}{10^5}$. So, we force


$\frac{1}{(2n+2)!}\left(\frac{\pi}{180}\right)^{2n+2} < \frac{1}{10^5}$


and find such an n.

CHAPTER 5 


Power series 


1. Power series on the real line 


We saw that Mac Laurin series are special cases of some particular
series of functions $\sum_{n=0}^{\infty} a_n x^n$, where $\{a_n\}$ is a fixed numerical sequence.
If one translates x into x - a, where a is a fixed real number, we obtain a
more general series of functions, $\sum_{n=0}^{\infty} a_n (x - a)^n$. These are called
power series (with centre at a) on the real line. If we put y = x - a in
this last series, we get $\sum_{n=0}^{\infty} a_n y^n$, i.e. a power series with centre at 0,
but in the variable y. Such translations reduce the study of a general
power series $\sum_{n=0}^{\infty} a_n (x - a)^n$ to a power series $\sum_{n=0}^{\infty} a_n x^n$ with centre
at 0. The mapping $x \to \sum_{n=0}^{\infty} a_n x^n$ gives rise to a function
$S(x) = \sum_{n=0}^{\infty} a_n x^n$. The maximal definition domain
$M_c = \{x \in R : \sum_{n=0}^{\infty} a_n x^n$ is convergent$\}$ of this function S is called the convergence set of the
series. At least x = 0 is an element of $M_c$ ($S(0) = a_0$). Sometimes $M_c$
reduces to the number 0. For instance, $S(x) = \sum_{n=0}^{\infty} n!\,x^n$ is convergent
only at 0. Indeed, let us consider the series $\sum_{n=0}^{\infty} n!\,|x|^n$ of moduli and
apply the Ratio Test: $\lim \frac{(n+1)!\,|x|^{n+1}}{n!\,|x|^n} = \lim (n + 1)|x| = \infty$, except x = 0. In
fact, if $x \neq 0$, $\{n!\,x^n\}$ does not tend to 0 (why?). Sometimes $M_c = R$,
as in the case of the series $S(x) = \sum_{n=0}^{\infty} \frac{1}{n!}x^n = \exp(x)$.

In the following, we want to describe the general form of the
convergence set of a power series $\sum_{n=0}^{\infty} a_n x^n$. Since the convergence set is the
same if we drop a finite number of terms, we can assume that $a_n \neq 0$
for any n = 0, 1, ... . If for an infinite number of n the term $a_n$ is 0,
we can define the following number R by using the Cauchy-Hadamard
formula (see Remark 16). Thus, finally, we can suppose that $a_n \neq 0$
for any n = 0, 1, ... . The number


$R = \frac{1}{\limsup\left\{\left|\frac{a_{n+1}}{a_n}\right|\right\}}$


in $[0, \infty]$ (i.e. R can also be $\infty$) is called the convergence radius of the
series $\sum_{n=0}^{\infty} a_n x^n$. Recall that $\limsup\{x_n\}$ is obtained in the following
way. Take all the convergent subsequences (include the unbounded and
increasing subsequences, i.e. subsequences which are "convergent" to
$\infty$ in $\overline{R}$) of the sequence $\{x_n\}$; the greatest of all their limits
is called $\limsup\{x_n\}$, the superior limit of the sequence $\{x_n\}$.

THEOREM 46. (Abel Theorem) Let $\sum_{n=0}^{\infty} a_n x^n$ be a power series
with real coefficients $a_0, a_1, \dots, a_n, \dots$ and let $R = \frac{1}{\limsup\{|a_{n+1}/a_n|\}}$ in $[0, \infty]$
be its convergence radius.

i) If $R \neq 0$, then the series S is absolutely convergent on the
interval (-R, R) and absolutely uniformly convergent on any closed interval
[-r, r], where 0 < r < R. Moreover, the series is absolutely and
uniformly convergent on any closed subinterval [a, b] of (-R, R). If $R \neq \infty$,
the series S is divergent on $(-\infty, -R) \cup (R, \infty)$, so,


$(-R, R) \subset M_c \subset [-R, R]$,


i.e. the convergence set of the series contains the open interval (-R, R),
it is contained in [-R, R] and at x = -R, or at x = R, we must decide
in each particular case if the series is convergent or not.

ii) If R = 0, then the series S is convergent only at x = 0, i.e.
$M_c = \{0\}$.

iii) If $R \neq 0$, then the function $S : (-R, R) \to R$ is of class $C^\infty$
on (-R, R), $S'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}$ (termwise differentiation) and a
primitive of S on (-R, R) is $U(x) = \sum_{n=0}^{\infty} \frac{a_n}{n+1}x^{n+1}$ (term by term
integration). All these power series $U, S, S', S'', S''', \dots, S^{(n)}, \dots$ and
any other power series obtained from them by a termwise integration
or differentiation process have the same convergence radius. Moreover,
if the series $\sum_{n=0}^{\infty} a_n x^n$ is convergent at x = R, for instance, then
the function $S : (-R, R] \to R$, defined by $S(x) = \sum_{n=0}^{\infty} a_n x^n$ if $x \neq R$
and $S(R) = \sum_{n=0}^{\infty} a_n R^n$, is continuous on (-R, R]. With this last
hypothesis fulfilled, we also have that the series $\sum_{n=0}^{\infty} a_n x^n$ is
uniformly convergent on each closed subinterval of the type
$[-R + \varepsilon, R]$, where $\varepsilon > 0$ is a small ($\varepsilon < 2R$) positive real number. The
same is true if we put -R instead of R and if the numerical series
$S(-R) = \sum_{n=0}^{\infty} a_n (-R)^n$ is convergent.

PROOF. The last statement will not be proved here. An elegant
proof can be found in [Pal], Theorem 2.4.6.

i) Let us consider x as a fixed parameter (for the moment) and let
us apply the Ratio Test to the series of moduli $\sum_{n=0}^{\infty} |a_n||x|^n$. Let L
be the limit


$L = \limsup\left\{\frac{|a_{n+1}||x|^{n+1}}{|a_n||x|^n}\right\} = |x| \limsup\left\{\left|\frac{a_{n+1}}{a_n}\right|\right\} = \frac{|x|}{R}$.


If $R = \infty$, then L = 0 < 1, so the series is absolutely convergent for
any $x \in R$. If R = 0, then $L = \infty$, except maybe the case when
x = 0. Hence, if R = 0, the series is convergent ONLY for x = 0, i.e.
the statement of ii). Suppose now that $R \neq 0, \infty$. Then, whenever
$L = \frac{|x|}{R} < 1$, or $x \in (-R, R)$, the series is absolutely convergent, in
particular convergent (see Theorem 31). If $x \in (-\infty, -R) \cup (R, \infty)$, or
|x| > R, then L > 1. Hence,


$\limsup\left\{\frac{|a_{n+1}||x|^{n+1}}{|a_n||x|^n}\right\} > 1$.


This means that there is at least one subsequence
$\left\{\frac{|a_{n_k+1}||x|^{n_k+1}}{|a_{n_k}||x|^{n_k}}\right\}$ of $\left\{\frac{|a_{n+1}||x|^{n+1}}{|a_n||x|^n}\right\}$ such that

$|a_{n_k+1}||x|^{n_k+1} > |a_{n_k}||x|^{n_k}$

for any k = 0, 1, ... . Thus the sequence $\{a_n x^n\}$ cannot tend to 0
and so, the series $\sum_{n=0}^{\infty} a_n x^n$ cannot be convergent for such an x. Let
now $x \in [-r, r]$, where 0 < r < R. For x = r < R, the series
$\sum_{n=0}^{\infty} |a_n| r^n$ is convergent ($r \in (-R, R)$, so the series $\sum_{n=0}^{\infty} a_n r^n$ is
absolutely convergent, see i)). But $|a_n x^n| \le |a_n| r^n$ for any n = 0, 1, ...
implies that the series $\sum_{n=0}^{\infty} a_n x^n$ is absolutely and uniformly
convergent (we apply here the Weierstrass Test, Theorem 41) on [-r, r]. Since
any interval $[a, b] \subset (-R, R)$ can be embedded in a symmetrical
interval of the form $[-r, r] \subset (-R, R)$, we obtain that the series $\sum_{n=0}^{\infty} a_n x^n$
is absolutely and uniformly convergent on ANY closed subinterval [a, b]
of (-R, R).

iii) It is easy to see that all the power series U, S', S'', ... have the
same convergence radius R as the series S. Applying the Weierstrass test
to each of them on an interval of the form $[-r, r] \subset (-R, R)$ and
Theorems 39 and 40, we can easily prove the first statement of iii).


Let us consider the power series


$\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}x^n$.


We know that this one is identical with ln(1 + x) on (-1, 1). Let us
find its convergence set $M_c$. The convergence radius is equal to


$R = \frac{1}{\limsup\left\{\left|\frac{a_{n+1}}{a_n}\right|\right\}} = \frac{1}{\limsup\left\{\frac{1/(n+1)}{1/n}\right\}} = 1$.

At x = -1, the series becomes

$\sum_{n=1}^{\infty} \frac{(-1)^{2n-1}}{n} = -\sum_{n=1}^{\infty} \frac{1}{n}$,

so the series is divergent at x = -1. Now, $S(1) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}$ is
the alternating series, which was proved to be convergent. Since both
functions S(x) and ln(1 + x) are continuous at x = 1 (prove it! by using
iii) of the Abel Theorem), one has that S(1) = ln 2. From Abel Theorem
we see that $M_c$ is exactly (-1, 1]. On this interval the sum is ln(1 + x) but
the series does not exist outside of (-1, 1], while the function ln(1 + x)
does exist, for instance at x = 2!
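
The behaviour of this series at the endpoint x = 1 and outside (-1, 1] can be observed numerically. The lines below are a minimal sketch with hypothetical names (`term`, `partial_sum`); they illustrate the statements above and do not prove them.

```python
from math import log

def term(x, n):
    """n-th term (-1)^(n-1) x^n / n of the series for ln(1 + x)."""
    return (-1) ** (n - 1) * x ** n / n

def partial_sum(x, terms):
    """Sum of the first `terms` terms of the series."""
    return sum(term(x, n) for n in range(1, terms + 1))

# Inside (-1, 1) the convergence is fast; at x = 1 it is slow but real.
print(partial_sum(0.5, 60), log(1.5))
print(partial_sum(1.0, 1000), log(2.0))
# At x = 2 the terms do not even tend to 0, although ln(3) exists.
print(term(2.0, 50), log(3.0))
```

At x = 1 the error after N terms of an alternating series with decreasing terms is at most the next term, 1/(N + 1), which explains the slow approach to ln 2.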

Let us now look at the binomial series


$1 + \sum_{n=1}^{\infty} \frac{\alpha(\alpha - 1)(\alpha - 2)\dots(\alpha - n + 1)}{n!}x^n$,

where $\alpha$ is a fixed real parameter. Let us find the convergence radius
of this series:

(1.1)  $R = \frac{1}{\limsup\left\{\left|\frac{a_{n+1}}{a_n}\right|\right\}} = \frac{1}{\lim \frac{|\alpha - n|}{n+1}} = 1$.


If x = -1, the series does not converge for every $\alpha$. For instance, if
$\alpha = -1$, then $\sum_{n=0}^{\infty} (-1)^n (-1)^n = \infty$. At x = 1, $\sum_{n=0}^{\infty} (-1)^n$ is divergent.
If $\alpha$ is a natural number k, then the series becomes a polynomial,
so its convergence set is the whole R. But ... the formula (1.1) and
Abel Theorem say that ... $M_c = R \subset [-1, 1]$ !!! Somewhere there must be
a mistake! Indeed, since $a_{k+1} = a_{k+2} = \dots = 0$, $\limsup\left|\frac{a_{n+1}}{a_n}\right|$ is
indeterminate, so the computation of R in (1.1) is wrong! We see
that the convergence set $M_c(\alpha)$ of the binomial series strongly depends
on $\alpha$. We do not give here a complete discussion of $M_c(\alpha)$ as a function
of $\alpha$.

Let us find the convergence set for the following series of functions


$\sum_{n=1}^{\infty} \frac{1}{n^2}\left(\frac{1}{2x + 1}\right)^n$.


This is not a power series, but making the substitution $y = \frac{1}{2x + 1}$ we
obtain a power series $\sum_{n=1}^{\infty} \frac{1}{n^2}y^n$ in the new variable y. The
convergence radius of this last series is


$R = \frac{1}{\limsup\left\{\left|\frac{a_{n+1}}{a_n}\right|\right\}} = \frac{1}{\lim_{n \to \infty} \frac{n^2}{(n+1)^2}} = 1$.

For $y = \pm 1$, the series is convergent (why?). So, the convergence set
$M_{c,y}$ for the power series


$\sum_{n=1}^{\infty} \frac{1}{n^2}y^n$

is $M_{c,y} = [-1, 1]$. Coming back to the variable x, we get that the initial
series of functions

$\sum_{n=1}^{\infty} \frac{1}{n^2}\left(\frac{1}{2x + 1}\right)^n$

is convergent if and only if $-1 \le \frac{1}{2x + 1} \le 1$, i.e. if and only if


$x \in (-\infty, -1] \cup [0, \infty)$.


Hence, the set of all x in R such that the series


$\sum_{n=1}^{\infty} \frac{1}{n^2}\left(\frac{1}{2x + 1}\right)^n$


is convergent, i.e. the convergence set of this last series, is
$(-\infty, -1] \cup [0, \infty)$.
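
The passage from the condition on y back to the variable x can be double-checked mechanically. The sketch below assumes that the substitution is $y = 1/(2x + 1)$ and that the series in y converges exactly for $|y| \le 1$; the function names and the sample points are hypothetical.

```python
def in_convergence_set(x):
    """True when y = 1/(2x + 1) is defined and satisfies |y| <= 1."""
    d = 2 * x + 1
    if d == 0:
        return False          # y is not defined at x = -1/2
    return abs(1 / d) <= 1

def in_claimed_set(x):
    """Membership in (-inf, -1] U [0, +inf)."""
    return x <= -1 or x >= 0

samples = [-3, -1, -0.75, -0.5, -0.25, 0, 0.5, 2]
for x in samples:
    print(x, in_convergence_set(x), in_claimed_set(x))
```

On every sample point the two tests agree, including the endpoints x = -1 and x = 0, where y equals -1 and 1 respectively.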


REMARK 16. (Cauchy-Hadamard) Another useful formula for computing
the convergence radius $R$ of a power series $\sum_{n=0}^{\infty} a_n x^n$ is the
following Cauchy-Hadamard formula:

(1.2)    $R = \dfrac{1}{\limsup \sqrt[n]{|a_n|}}$

This formula can be used even when an infinite number of the $a_n$ are zero.
The proof of Abel's Theorem by using this formula for $R$ is completely
analogous to the proof of the same theorem given above. In this case one
must use the Root Test (Theorem 29) instead of the Ratio Test, as we
did in proving Abel's Theorem. If we start with the definition of $R$ as it
appears in the Cauchy-Hadamard formula (1.2), we get the same interval
of convergence $(-R, R)$ for our series $\sum_{n=0}^{\infty} a_n x^n$ (why?). Thus,
both formulas give rise to one and the same number.
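The Cauchy-Hadamard formula can be explored numerically. The sketch below is only an illustration under stated assumptions: the function name `root_test_radius` and the tail bounds `n_min`, `n_max` are hypothetical choices, and the limsup is approximated by a maximum over a finite tail, so the result is an estimate and not a proof.

```python
import math


def root_test_radius(coeff, n_min=200, n_max=2000):
    """Estimate R = 1 / limsup |a_n|^(1/n) from the finite tail n_min..n_max.

    The limsup is replaced by the maximum of |a_n|^(1/n) over the tail,
    which is only a rough numerical stand-in for the true limsup.
    """
    tail_sup = max(abs(coeff(n)) ** (1.0 / n) for n in range(n_min, n_max + 1))
    return math.inf if tail_sup == 0.0 else 1.0 / tail_sup


def lacunary(n):
    # a_n = 1/n for odd n and a_n = 0 for even n:
    # infinitely many coefficients vanish, so the ratio a_{n+1}/a_n is useless.
    return 1.0 / n if n % 2 == 1 else 0.0


print(root_test_radius(lacunary))  # a value close to 1
```

The zero coefficients do no harm here, because they only contribute the value 0 to the set over which the maximum is taken.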


Let us find the convergence set and the sum of the series of functions

$$\sum_{n=0}^{\infty} \frac{1}{2n+1} (3x+2)^{2n+1}.$$

This one is not a power series but... we can associate to it a power
series by the following substitution $y = 3x + 2$. Hence, we must study
the power series in $y$:

$$\sum_{n=0}^{\infty} \frac{1}{2n+1} y^{2n+1}.$$

Here $a_{2n+1} = \frac{1}{2n+1}$ and $a_{2n} = 0$ for any $n = 0, 1, \dots$ . In our case, it is
not a good idea to apply Abel's formula $R = \frac{1}{\limsup |a_{n+1}/a_n|}$ (why?). Let
us apply the Cauchy-Hadamard formula (1.2):

$$R = \frac{1}{\limsup \sqrt[n]{|a_n|}} = 1,$$

because the sequence $\{\sqrt[n]{|a_n|}\}$ is the union of two convergent
subsequences:

$$\left\{ \sqrt[2n+1]{|a_{2n+1}|} \right\} = \left\{ \frac{1}{\sqrt[2n+1]{2n+1}} \right\} \to 1$$

(why?) and

$$\left\{ \sqrt[2n]{|a_{2n}|} \right\} = \{0\} \to 0,$$

and so, $\limsup \sqrt[n]{|a_n|} = 1$. At $y = -1$ the series

$$\sum_{n=0}^{\infty} \frac{1}{2n+1} y^{2n+1}$$

becomes

$$-\sum_{n=0}^{\infty} \frac{1}{2n+1} = -\infty$$

(why?). At $y = 1$ the series is

$$\sum_{n=0}^{\infty} \frac{1}{2n+1} = \infty.$$

Hence, the convergence set for the power series in $y$ is $(-1, 1)$ (see Abel
Theorem 46). Now, if $T(y) = \sum_{n=0}^{\infty} \frac{y^{2n+1}}{2n+1}$ for $y \in (-1, 1)$, one has:

$$T'(y) = \sum_{n=0}^{\infty} y^{2n} = \frac{1}{1-y^2} = \frac{1}{2} \cdot \frac{1}{1-y} + \frac{1}{2} \cdot \frac{1}{1+y}.$$

Thus,

$$T(y) = \frac{1}{2} \ln \frac{1+y}{1-y} + C.$$

But $C = 0$ because $T(0) = 0$. Let us come back to the series in $x$. The
convergence set is

$$\{x \in \mathbb{R} : -1 < 3x + 2 < 1\} = \left(-1, -\tfrac{1}{3}\right).$$

Its sum is

$$S(x) = T(3x+2) = \frac{1}{2} \ln \left( \frac{3x+3}{-3x-1} \right)$$

for any $x \in (-1, -\frac{1}{3})$.
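As a sanity check of the closed form just obtained, one can compare partial sums with the logarithmic expression at a few points of $(-1, -\frac{1}{3})$. This is a minimal sketch; the function names and the number of terms are hypothetical choices made only for the illustration.

```python
import math


def partial_sum(x, terms=2000):
    """Partial sum of sum_{n>=0} (3x+2)^(2n+1) / (2n+1)."""
    y = 3.0 * x + 2.0
    return sum(y ** (2 * n + 1) / (2 * n + 1) for n in range(terms))


def closed_form(x):
    """S(x) = (1/2) ln((1+y)/(1-y)) with y = 3x + 2, valid for -1 < y < 1."""
    y = 3.0 * x + 2.0
    return 0.5 * math.log((1.0 + y) / (1.0 - y))


for x in (-0.9, -0.5, -0.4):
    print(x, partial_sum(x), closed_form(x))
```

Close to the endpoints $-1$ and $-\frac{1}{3}$ the convergence becomes slow, so many more terms would be needed there.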


EXAMPLE 9. (arctan series) Let us find the Mac Laurin expansion
for $f(x) = \arctan x$. For this let us consider

$$f'(x) = \frac{1}{1+x^2} = 1 - x^2 + x^4 - \dots + (-1)^n x^{2n} + \dots,$$

where $|x| < 1$ (why?). Apply now Theorem 43 and integrate this last
equality termwise:

(1.3)    $\arctan x + C = x - \dfrac{x^3}{3} + \dfrac{x^5}{5} - \dots + (-1)^n \dfrac{x^{2n+1}}{2n+1} + \dots,$

where $|x| < 1$. For $x = 0$ we get $C = 0$. Since for $x = 1$ the series on
the right is convergent and since the function

$$S(x) = x - \frac{x^3}{3} + \frac{x^5}{5} - \dots + (-1)^n \frac{x^{2n+1}}{2n+1} + \dots$$

is continuous at $x = 1$ (see Abel's Theorem, iii)), we get that

(1.4)    $\arctan 1 = \dfrac{\pi}{4} = 1 - \dfrac{1}{3} + \dfrac{1}{5} - \dfrac{1}{7} + \dots + (-1)^n \dfrac{1}{2n+1} + \dots$
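Formulas (1.3) and (1.4) are easy to test numerically. The following sketch (the function name `arctan_series` is a hypothetical choice) sums the series and compares it with the library value of the arctangent. At $x = 1$ the convergence is very slow, which the alternating series error bound predicts.

```python
import math


def arctan_series(x, terms):
    """Partial sum x - x^3/3 + x^5/5 - ... with the given number of terms."""
    return sum((-1) ** n * x ** (2 * n + 1) / (2 * n + 1) for n in range(terms))


# Fast convergence well inside the interval (-1, 1).
print(arctan_series(0.5, 60), math.atan(0.5))

# Slow convergence at the endpoint x = 1 (Leibniz series for pi/4).
print(4.0 * arctan_series(1.0, 100000), math.pi)
```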


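Let us find the convergence set and the sum for the power series

$$\sum_{n=1}^{\infty} n(n+1) x^n.$$

The convergence radius is

$$R = \lim_{n \to \infty} \left| \frac{a_n}{a_{n+1}} \right| = \lim_{n \to \infty} \frac{n(n+1)}{(n+1)(n+2)} = 1$$

(why?). Since at $x = \pm 1$ the series is divergent ($n(n+1) \not\to 0$),
the convergence set is $M_c = (-1, 1)$. Let us integrate termwise (see
Theorem 43) the above series for $x \in (-1, 1)$:

$$\int \left[ \sum_{n=1}^{\infty} n(n+1) x^n \right] dx = \sum_{n=1}^{\infty} n x^{n+1} = \sum_{n=1}^{\infty} (n+2) x^{n+1} - 2 \sum_{n=1}^{\infty} x^{n+1}.$$

But the series

$$\sum_{n=1}^{\infty} x^{n+1} = x^2 \left[ 1 + x + x^2 + \dots \right] = \frac{x^2}{1-x}$$

(it is an infinite geometrical progression). So we get

$$\int \left[ \sum_{n=1}^{\infty} n(n+1) x^n \right] dx = \sum_{n=1}^{\infty} (n+2) x^{n+1} - \frac{2x^2}{1-x}.$$

Let us integrate again this last equality:

$$\int \left\{ \int \left[ \sum_{n=1}^{\infty} n(n+1) x^n \right] dx \right\} dx = \sum_{n=1}^{\infty} x^{n+2} - 2 \int \frac{x^2}{1-x} dx = \frac{x^3}{1-x} + x^2 + 2x + 2 \ln(1-x).$$

Coming back and differentiating twice, we get:

$$\sum_{n=1}^{\infty} n(n+1) x^n = \frac{2x}{(1-x)^3}, \quad \text{for } |x| < 1.$$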
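The closed form $\frac{2x}{(1-x)^3}$ can be checked against partial sums. The sketch below is only a numerical illustration; the function names and the default number of terms are hypothetical.

```python
def partial_sum(x, terms=400):
    """Partial sum of sum_{n>=1} n (n+1) x^n."""
    return sum(n * (n + 1) * x ** n for n in range(1, terms + 1))


def closed_form(x):
    """2x / (1 - x)^3, the sum of the series for |x| < 1."""
    return 2.0 * x / (1.0 - x) ** 3


for x in (0.5, -0.3):
    print(x, partial_sum(x), closed_form(x))
```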


2. Complex power series and Euler formulas


In Chapter 2, Section 2, we introduced the metric space $\mathbb{C}$ of complex
numbers. In fact, $\mathbb{C}$ is a normed space with the norm given by
the usual complex modulus $|z| = \sqrt{x^2 + y^2}$, where $z = x + iy$, $x, y \in \mathbb{R}$
(prove the properties of the norm for this particular norm!). Since a
sequence $\{z_n = x_n + i y_n\}$ is convergent to $z = x + iy$ in $\mathbb{C}$ if and only
if both the real sequences $\{x_n\}$ and $\{y_n\}$ are convergent to $x$ and to
$y$ respectively (see Theorems 1 and 16), the study of numerical
series with complex terms reduces to the study of real numerical
series. But this way is not so easy to put into practice. The best way is
to use first the notion of absolute convergence, as in the case of series
in a general normed space. Namely, let $s = \sum_{n=0}^{\infty} z_n$ be a series with
complex terms and let $S = \sum_{n=0}^{\infty} |z_n|$ be the real series of
moduli. The following result is very useful in practice.


THEOREM 47. If the series of moduli $S = \sum_{n=0}^{\infty} |z_n|$ is convergent
(as a numerical real series with nonnegative terms), then the initial series
with complex terms $s = \sum_{n=0}^{\infty} z_n$ is convergent in $\mathbb{C}$.


PROOF. Let $s_n = \sum_{k=0}^{n} z_k$ be the n-th partial sum of the series
$s = \sum_{n=0}^{\infty} z_n$ and let $S_n = \sum_{k=0}^{n} |z_k|$ be the n-th partial sum of the
series of moduli $S = \sum_{n=0}^{\infty} |z_n|$. Since

$$|s_{n+p} - s_n| \le |z_{n+1}| + |z_{n+2}| + \dots + |z_{n+p}| = S_{n+p} - S_n,$$

and since the series $S$ is convergent (i.e. the sequence $\{S_n\}$ is a Cauchy
sequence), one obtains that the sequence $\{s_n\}$ is a Cauchy sequence.
Thus, it is convergent to a complex number $s$ (the sum of the series
$\sum_{n=0}^{\infty} z_n$) in $\mathbb{C}$, because $\mathbb{C}$ is a complete metric space (see Theorem
16).


The Cauchy Test and the zero Test also work in the case of a complex
series (why? Hint: $\mathbb{C}$ is a complete metric space; why?). Series of
complex functions and power series are defined exactly in the same way
as in the analogous real case. However, in the complex case, the study
of the convergence set of a series of functions is more complicated than
in the real case.


EXAMPLE 10. (Complex geometrical series). Let us find the convergence
set for the complex geometrical series

$$s(z) = \sum_{n=0}^{\infty} z^n = 1 + z + z^2 + \dots$$

Let us consider the series of moduli

$$S(|z|) = \sum_{n=0}^{\infty} |z|^n = \lim_{n \to \infty} \frac{1 - |z|^{n+1}}{1 - |z|}.$$

This limit exists (and is finite) if and only if $|z| < 1$. Hence, the series is
absolutely convergent if and only if $|z| < 1$. In particular, for $|z| < 1$, the
series is convergent (see Theorem 47). Is the series convergent for a $z$ with
$|z| > 1$? Let us see! If $|z| > 1$, the sequence $\{z^n\}$ goes to $\infty$ in
$\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$, the Riemann sphere (why?), so the series is divergent
(see the zero Test). What happens if $|z| = 1$, i.e. if $z$ is a complex number
on the circle of radius 1 with centre at the origin? If $z = 1$, the series is
divergent. If $z \ne 1$, but $|z| = 1$, the sequence $\{z^n\}$ is never convergent to
zero (why?). Thus, the convergence set for the series $s(z) = \sum_{n=0}^{\infty} z^n$ is
exactly the open disc $B(0; 1) = \{z \in \mathbb{C} : |z| < 1\}$ in the complex plane
$\mathbb{C}$.
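A small numerical experiment illustrates the behaviour of the complex geometrical series. It is only a sketch under assumptions: the helper name is hypothetical, and the comparison value $\frac{1}{1-z}$ is the standard sum of the geometric series for $|z| < 1$, which is not stated in the text above.

```python
def geometric_partial_sum(z, terms):
    """Return 1 + z + z^2 + ... + z^(terms-1) for a complex number z."""
    total, power = 0j, 1 + 0j
    for _ in range(terms):
        total += power
        power *= z
    return total


inside = 0.3 + 0.4j              # |z| = 0.5 < 1: the series converges
print(geometric_partial_sum(inside, 200), 1 / (1 - inside))

on_circle = 1j                   # |z| = 1: |z^n| = 1, so z^n does not tend to 0
print([abs(on_circle ** n) for n in (1, 10, 50)])
```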


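To define the basic elementary complex functions one uses complex
power series. For instance, the exponential complex function is defined
by the formula

(2.1)    $\exp(z) \overset{def}{=} 1 + \dfrac{z}{1!} + \dfrac{z^2}{2!} + \dots + \dfrac{z^n}{n!} + \dots = \sum_{n=0}^{\infty} \dfrac{z^n}{n!}, \quad z \in \mathbb{C}.$

It is easy to prove (do it!) that this series is absolutely convergent on
the whole complex plane $\mathbb{C}$ and absolutely uniformly convergent on any
bounded subset of $\mathbb{C}$. One can prove that $\exp(z_1 + z_2) = \exp(z_1) \exp(z_2)$
for any $z_1, z_2$ in $\mathbb{C}$ (see [ST] for instance).
The series on the right side of (2.1) is the natural extension of the
Mac Laurin expansion of the real function $\exp(x)$ to the whole complex
plane. Using this "trick" we can define other elementary complex
functions:

(2.2)    $\sin(z) \overset{def}{=} z - \dfrac{z^3}{3!} + \dfrac{z^5}{5!} - \dots + (-1)^n \dfrac{z^{2n+1}}{(2n+1)!} + \dots = \sum_{n=0}^{\infty} (-1)^n \dfrac{z^{2n+1}}{(2n+1)!}, \quad z \in \mathbb{C}$

(2.3)    $\cos(z) \overset{def}{=} 1 - \dfrac{z^2}{2!} + \dfrac{z^4}{4!} - \dots + (-1)^n \dfrac{z^{2n}}{(2n)!} + \dots = \sum_{n=0}^{\infty} (-1)^n \dfrac{z^{2n}}{(2n)!}, \quad z \in \mathbb{C}$

(2.4)    $\ln(1+z) \overset{def}{=} z - \dfrac{z^2}{2} + \dfrac{z^3}{3} - \dots + (-1)^{n-1} \dfrac{z^n}{n} + \dots = \sum_{n=1}^{\infty} (-1)^{n-1} \dfrac{z^n}{n}, \quad |z| < 1$

$$(1+z)^{\alpha} = 1 + \frac{\alpha}{1!} z + \frac{\alpha(\alpha-1)}{2!} z^2 + \dots + \frac{\alpha(\alpha-1)(\alpha-2) \dots (\alpha-n+1)}{n!} z^n + \dots,$$

so,

(2.5)    $(1+z)^{\alpha} \overset{def}{=} \sum_{n=0}^{\infty} \dfrac{\alpha(\alpha-1)(\alpha-2) \dots (\alpha-n+1)}{n!} z^n, \quad |z| < 1, \ \alpha \in \mathbb{C}.$

In the same way we can define any other complex function $f(z)$ if
we know a Taylor expansion for the real function $f(x)$ (if this last one
has real values and if it can be extended beyond the real line!). For
instance, we know that

$$\mathrm{sh}(x) = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \dots + \frac{x^{2n+1}}{(2n+1)!} + \dots, \quad x \in \mathbb{R}.$$

We simply define the complex hyperbolic sine as

(2.6)    $\mathrm{sh}(z) \overset{def}{=} z + \dfrac{z^3}{3!} + \dfrac{z^5}{5!} + \dots + \dfrac{z^{2n+1}}{(2n+1)!} + \dots, \quad z \in \mathbb{C},$

and

(2.7)    $\mathrm{ch}(z) \overset{def}{=} 1 + \dfrac{z^2}{2!} + \dfrac{z^4}{4!} + \dots + \dfrac{z^{2n}}{(2n)!} + \dots, \quad z \in \mathbb{C}.$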
We always have to check whether the series on the right side is convergent on
the extended domain (for instance, we extended $\mathbb{R}$ to $\mathbb{C}$). The
restrictions of all these functions to their definition domains on the real
line give rise to the well known real functions. For instance, $\ln(1+z)$,
$|z| < 1$, restricted to the real line gives rise to $\ln(1+x)$. This does not mean that
we defined the function $\ln(z)$ for any $z \ne 0$! To define such a function,
i.e. the inverse of the complex exponential function, is not an easy
task, because it will not be a usual function, i.e. for a given $z$ we have more
than one value of $\ln(z)$. This is because $\exp(z)$ is not injective.
To see this we need some famous relations, the Euler formulas.


THEOREM 48. (Euler relations) For any real number $x$ and for
$i \overset{def}{=} \sqrt{-1}$ we have

(2.8)    $\exp(ix) = \cos(x) + i \sin(x),$

(2.9)    $\cos(x) = \dfrac{\exp(ix) + \exp(-ix)}{2}$

and

$$\sin(x) = \frac{\exp(ix) - \exp(-ix)}{2i}.$$

PROOF. We simply use formula (2.1) to compute $\exp(ix)$:

$$\exp(ix) = 1 + \frac{ix}{1!} - \frac{x^2}{2!} - \frac{i x^3}{3!} + \frac{x^4}{4!} + \dots = \left( 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots \right) + i \left( x - \frac{x^3}{3!} + \dots \right) = \cos(x) + i \sin(x).$$

If now we put $-x$ instead of $x$ in formula (2.8), we get

(2.10)    $\exp(-ix) = \cos(x) - i \sin(x),$

because cosine is an even function and sine is an odd one. Adding
formulas (2.8) and (2.10), we get the relation $\exp(ix) + \exp(-ix) = 2 \cos(x)$.
Now, subtract formula (2.10) from formula (2.8) and get the
formula $\exp(ix) - \exp(-ix) = 2i \sin(x)$, etc.
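The Euler relation (2.8) can be observed numerically by summing the defining series (2.1) at a purely imaginary argument. The sketch below is an illustration only; the name `exp_series` and the choice of 40 terms are assumptions made for the example.

```python
import cmath
import math


def exp_series(z, terms=40):
    """Partial sum of the series 1 + z/1! + z^2/2! + ... defining exp(z)."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)      # turns z^n/n! into z^(n+1)/(n+1)!
    return total


x = 1.2
print(exp_series(1j * x))                 # series value of exp(ix)
print(complex(math.cos(x), math.sin(x)))  # cos(x) + i sin(x)
print(cmath.exp(1j * x))                  # library value, for comparison
```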


Let us justify now that the complex function $\exp(z)$ is not invertible,
i.e. it cannot have a usual function as its inverse. Using the Euler formulas
from the theorem we get that

$$\exp(2k\pi i) = \cos(2k\pi) + i \sin(2k\pi) = 1,$$

for any integer $k$. Thus one has an infinite number of complex numbers
$\{2n\pi i\}$, $n = 0, \pm 1, \pm 2, \dots$, at which the exponential function has the value
1! This is why the inverse of $\exp(z)$ is the multivalued function

$$\mathrm{Ln}(z) = \ln|z| + i(\theta + 2k\pi), \quad k = 0, \pm 1, \pm 2, \dots,$$

where $\theta$ is the argument of $z$, i.e. the unique real number in $[0, 2\pi)$
such that $z = |z| [\cos\theta + i \sin\theta]$, the trigonometric representation of $z$
(prove this last equality by drawing...). It has a doubly infinite number
of "branches", i.e. $\mathrm{Ln}(z)$ is in fact the set

$$\{\ln_k(z) = \ln|z| + i(\theta + 2k\pi)\}, \quad k = 0, \pm 1, \pm 2, \dots$$

of usual functions. All of these functions have the same real part $\ln|z|$.
For $k = 0$ we get the principal branch, $\ln(z) = \ln|z| + i \arg z$. Sometimes
in books people work with this last expression for the complex
logarithmic function, without mentioning this. We leave as an exercise for
the reader to define the radical complex multiform function $\sqrt[n]{z}$ (it has
only $n$ branches! Find them!). One can start with the fact that $\sqrt[n]{z}$ is
the inverse of the power $n$ function $z \mapsto z^n$ and with the equality:

$$z^n = |z|^n [\cos n\theta + i \sin n\theta],$$

etc.
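The branches of the multivalued logarithm and of the n-th root can be listed explicitly. The following sketch uses the convention of the text (argument $\theta$ in $[0, 2\pi)$); the function names `log_branch` and `nth_roots` are hypothetical, and only standard `cmath` and `math` functions are used.

```python
import cmath
import math


def log_branch(z, k):
    """k-th branch ln|z| + i(theta + 2k*pi), with theta in [0, 2*pi)."""
    theta = cmath.phase(z) % (2.0 * math.pi)
    return complex(math.log(abs(z)), theta + 2.0 * math.pi * k)


def nth_roots(z, n):
    """The n values of the n-th root of a nonzero complex number z."""
    r = abs(z) ** (1.0 / n)
    theta = cmath.phase(z) % (2.0 * math.pi)
    return [cmath.rect(r, (theta + 2.0 * math.pi * k) / n) for k in range(n)]


z = -1 + 1j
for k in (-1, 0, 1):
    w = log_branch(z, k)
    print(k, w, cmath.exp(w))    # every branch is mapped back to z by exp

print(nth_roots(z, 3))           # three distinct cube roots of z
```

Note that `cmath.log` itself returns a branch whose imaginary part lies in $(-\pi, \pi]$, which differs from the $[0, 2\pi)$ convention used above when the argument is negative.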
Euler's formulas from the above theorem are very useful in practice.
For instance, the famous de Moivre formula

$$[\cos x + i \sin x]^n = \cos nx + i \sin nx$$

from trigonometry can be immediately proved by using the basic
properties of the complex exponential function: $\exp(z) \exp(w) = \exp(z+w)$
(try to prove it!), $(\exp z)^n = \exp(nz)$, where $z, w \in \mathbb{C}$ and $n$ is an
integer. If one extends in a natural way (componentwise!) the
integral calculus from real functions to functions of a real variable but
with complex values:

$$\int [f(x) + i g(x)] dx = \int f(x) dx + i \int g(x) dx,$$

one can compute in an easy way more complicated integrals. For
instance, let us find a primitive for a well-known family of functions
$f(x) = \exp(ax) \cos(bx)$, where $a, b$ are two fixed real numbers (parameters),
not both zero. Let us denote $g(x) = \exp(ax) \sin(bx)$ (its partner!) and let
us find a primitive for $f(x) + i g(x)$:

$$\int [\exp(ax) \cos(bx) + i \exp(ax) \sin(bx)] dx = \int \exp(ax) \exp(ibx) dx =$$

$$= \int \exp(ax + ibx) dx = \frac{\exp(ax + ibx)}{a + ib} =$$

$$= \frac{\exp(ax) \cdot [\cos(bx) + i \sin(bx)] (a - ib)}{a^2 + b^2} =$$

$$= \exp(ax) \frac{a \cos(bx) + b \sin(bx)}{a^2 + b^2} + i \exp(ax) \frac{a \sin(bx) - b \cos(bx)}{a^2 + b^2}.$$

Hence,

$$\int \exp(ax) \cos(bx) dx = \exp(ax) \frac{a \cos(bx) + b \sin(bx)}{a^2 + b^2},$$

$$\int \exp(ax) \sin(bx) dx = \exp(ax) \frac{a \sin(bx) - b \cos(bx)}{a^2 + b^2}$$

(why?).
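One way to gain confidence in the two primitives is to differentiate them numerically and compare with the integrands. This is a minimal sketch; the function names, the sample values of $a$, $b$, $x$ and the step $h$ of the central difference are all assumptions chosen for the illustration.

```python
import math


def primitive_cos(x, a, b):
    """Primitive of exp(ax) cos(bx), for a and b not both zero."""
    return math.exp(a * x) * (a * math.cos(b * x) + b * math.sin(b * x)) / (a * a + b * b)


def primitive_sin(x, a, b):
    """Primitive of exp(ax) sin(bx), for a and b not both zero."""
    return math.exp(a * x) * (a * math.sin(b * x) - b * math.cos(b * x)) / (a * a + b * b)


def central_difference(F, x, h=1e-6):
    """Symmetric difference quotient approximating F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)


a, b, x = 0.5, 2.0, 0.7
print(central_difference(lambda t: primitive_cos(t, a, b), x), math.exp(a * x) * math.cos(b * x))
print(central_difference(lambda t: primitive_sin(t, a, b), x), math.exp(a * x) * math.sin(b * x))
```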


Another example of a nice application of the Euler formulas is the
following. Suppose we forgot the formulas for $\sin 3x$ and $\cos 3x$ in
terms of $\sin x$ and $\cos x$ respectively. Let us find them by writing

$$\cos 3x + i \sin 3x = \exp(i 3x) \quad \text{(Euler formula)}$$

$$= [\exp(ix)]^3 = [\cos x + i \sin x]^3 =$$

$$= \cos^3 x - 3 \cos x \sin^2 x + i [3 \cos^2 x \sin x - \sin^3 x].$$

Since two complex numbers are equal if and only if their real and imaginary
parts are equal, we get the formulas:

$$\cos 3x = \cos x [\cos^2 x - 3 \sin^2 x] = \cos x [4 \cos^2 x - 3],$$

$$\sin 3x = 3 \cos^2 x \sin x - \sin^3 x = \sin x [3 - 4 \sin^2 x].$$
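Both the de Moivre formula and the triple-angle formulas derived from it can be checked at a sample point. The sketch below is illustrative only; the helper names are hypothetical.

```python
import math


def de_moivre_sides(x, n):
    """Return ([cos x + i sin x]^n, cos nx + i sin nx) for comparison."""
    lhs = complex(math.cos(x), math.sin(x)) ** n
    rhs = complex(math.cos(n * x), math.sin(n * x))
    return lhs, rhs


def triple_angle(x):
    """Return (cos x (4 cos^2 x - 3), sin x (3 - 4 sin^2 x))."""
    c, s = math.cos(x), math.sin(x)
    return c * (4.0 * c * c - 3.0), s * (3.0 - 4.0 * s * s)


x = 0.37
print(de_moivre_sides(x, 5))
print(triple_angle(x), (math.cos(3 * x), math.sin(3 * x)))
```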


3. Problems 


1. Find the convergence set and the sum for the following series of 


functions: 
A) gE to) 5D) gL) aa be) 


doe (-)" 1; Cg n(3a a 5)”; a pe eeairad 
2. Find the convergence set for the following series of functions: 
a) Gay? (E38) bd) aE cl ae nix”; di) aT}
dye) ye, eee CB)” es
1) enol = (=2)" 0") op (— 1) 82", WY oe (GES) 
ner (—D" Ss m) ea) SS (find its sum); 
3. Use the power series in order to compute the following sums: 


cay oa oat amie alo) ) piers SEP, c)> 7.) 3h; (Hint: associate the power 


series 


a / 
S(a) = Sona" = 2(1420-+80?+...) = a2(ete?+29+...)! <2 ( : :) | 
n=1 


make then z = $). 


CHAPTER 6 


The normed space $\mathbb{R}^m$


1. Distance properties in $\mathbb{R}^m$


Motivation. Let $\{O; \mathbf{i}, \mathbf{j}\}$ be a Cartesian coordinate system in a
plane $(P)$. To any point $M \in (P)$ we associate the position vector
$\overrightarrow{OM}$. We know that there is a unique pair $(x, y)$ of real numbers such
that $\overrightarrow{OM} = x\mathbf{i} + y\mathbf{j}$. Here $\mathbf{i}, \mathbf{j}$ are two perpendicular versors (unit
vectors) with their origin in $O$. Usually one calls $(x, y)$ the coordinates of
$M$ relative to the "basis" $\{\mathbf{i}, \mathbf{j}\}$. But we can view $(x, y)$ as an element in
$\mathbb{R} \times \mathbb{R} = \mathbb{R}^2$. If $M'$ is another point in the same plane $(P)$ and if $P$ is the
unique point in $(P)$ such that $\overrightarrow{OM} + \overrightarrow{OM'} = \overrightarrow{OP}$, then the coordinates of $P$ are
$(x + x', y + y')$, where $(x', y')$ are the coordinates of $M'$. Let $\alpha$ be a real
number (scalar) and let us denote by $\overrightarrow{OM''}$ the vector $\alpha \overrightarrow{OM}$. Then, the
coordinates of the point $M''$ are $(\alpha x, \alpha y) \in \mathbb{R}^2$. So, one can endow the
cartesian product $\mathbb{R}^2$ with a natural algebraic structure of a real vector
space with 2 dimensions (the number of the elements in any basis of
it, in particular in the "canonical" basis $\{(1, 0), (0, 1)\}$, where $(1, 0)$
are the coordinates of the versor $\mathbf{i}$ and $(0, 1)$ are the coordinates of the
versor $\mathbf{j}$). Hence, one can study the 2-dimensional dynamics only in the
"abstract" space $\mathbb{R}^2$ (this is the basic idea of R. Descartes; the word
"cartesian" comes from "Descartes", in Latin "Cartesius"; he invented
a very useful tool for Engineering, namely the Analytic Geometry; here
we work with numbers and equations instead of geometrical objects like
lines, circles, parabolas, etc.). We call $\mathbb{R}^2$ the 2-dimensional space (2-D
space). In the same way we can construct the 3-D space $\mathbb{R}^3$ or, more
generally, the m-D space

$$\mathbb{R}^m = \underbrace{\mathbb{R} \times \mathbb{R} \times \dots \times \mathbb{R}}_{m \text{ times}} = \{\mathbf{x} = (x_1, x_2, \dots, x_m) : x_i \in \mathbb{R}\}.$$

We recall that if $\mathbf{x} = (x_1, x_2, \dots, x_m)$ and $\mathbf{y} = (y_1, y_2, \dots, y_m)$ are two
"vectors" in $\mathbb{R}^m$, then

$$\mathbf{x} + \mathbf{y} = (x_1 + y_1, x_2 + y_2, \dots, x_m + y_m)$$

and

$$\alpha \mathbf{x} = (\alpha x_1, \alpha x_2, \dots, \alpha x_m)$$

for any "scalar" $\alpha \in \mathbb{R}$ (componentwise operations). For instance,
$(-7, 3) + (6, 0) = (-1, 3)$ and $\sqrt{2}(-1, 1) = (-\sqrt{2}, \sqrt{2})$. To do analysis in
$\mathbb{R}^m$ means firstly to introduce a distance in $\mathbb{R}^m$. $\mathbb{R}^m$ has the "canonical
basis"

$$\{(1, 0, \dots, 0), (0, 1, 0, \dots, 0), \dots, (0, 0, \dots, 0, 1)\}$$

as a real vector space, so it has the dimension $m$ over $\mathbb{R}$. It is more
profitable to introduce first of all a "length" of a vector $\mathbf{x} = (x_1, x_2, \dots, x_m)$
by the formula

(1.1)    $\|\mathbf{x}\| \overset{def}{=} \sqrt{x_1^2 + x_2^2 + \dots + x_m^2}.$

The nonnegative real number $\|\mathbf{x}\|$ is called the norm or the length of $\mathbf{x}$.
If $m = 1$, the norm of a real number $x$ is its absolute value (modulus)
$|x|$. If $m = 2$ and if $\mathbf{x} = (x_1, x_2)$, the norm $\|\mathbf{x}\| = \sqrt{x_1^2 + x_2^2}$ is exactly
the length of the diagonal of the rectangle $[O A_1 M A_2]$, or the length of
the resultant vector $\overrightarrow{OM} = \overrightarrow{OA_1} + \overrightarrow{OA_2}$ (see Fig. 6.1).


[Fig. 6.1: the point $M(x_1, x_2)$ in the plane $Oxy$, with its projections $A_1$ and $A_2$ on the coordinate axes.]


In the 3-D space $\mathbb{R}^3$ the norm of $\mathbf{x} = (x_1, x_2, x_3)$ is $\sqrt{x_1^2 + x_2^2 + x_3^2}$
and it is exactly the length of the diagonal of the parallelepiped generated
by $\overrightarrow{OA_1}$, $\overrightarrow{OA_2}$ and $\overrightarrow{OA_3}$ (see Fig. 6.2).

[Fig. 6.2: the point $M(x_1, x_2, x_3)$ in the 3-D space, with $A_3$ on the $z$-axis.]


EXAMPLE 11. (the space-time representation) Let us consider the
vector $\mathbf{x} = (x_1, x_2, x_3, t) \in \mathbb{R}^4$, where $(x_1, x_2, x_3)$ are the coordinates of
a point $M(x_1, x_2, x_3)$ in the 3-D space and $t > 0$ is the time when we
"observe" the point $M$. Then

$$\|\mathbf{x}\| = \sqrt{x_1^2 + x_2^2 + x_3^2 + t^2}.$$


EXAMPLE 12. (the space of dynamics) Let us consider a moving
point $M$ on a trajectory $(\gamma)$ in the 3-D space. The position of $M$ is
fixed by its coordinates $x_1, x_2, x_3$. Its velocity $\mathbf{v}$ is given by another
3 coordinates $\dot{x}_1, \dot{x}_2, \dot{x}_3$, the derivatives of the coordinate functions
$x_1(t), x_2(t), x_3(t)$ at $M$. Thus, the "dynamic" state of $M$ is described
by the "vector"

$$\mathbf{x} = (x_1, x_2, x_3, \dot{x}_1, \dot{x}_2, \dot{x}_3) \in \mathbb{R}^6$$

and

$$\|\mathbf{x}\| = \sqrt{x_1^2 + x_2^2 + x_3^2 + \dot{x}_1^2 + \dot{x}_2^2 + \dot{x}_3^2}.$$


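THEOREM 49. The norm mapping

$$\mathbf{x} \mapsto \|\mathbf{x}\| = \sqrt{x_1^2 + x_2^2 + \dots + x_m^2},$$

from $\mathbb{R}^m$ to $\mathbb{R}_+$, has the following main properties: 1) $\|\mathbf{x}\| = 0$ if
and only if $\mathbf{x} = \mathbf{0}$; 2) $\|\alpha \mathbf{x}\| = |\alpha| \|\mathbf{x}\|$ for any $\alpha \in \mathbb{R}$, $\mathbf{x} \in \mathbb{R}^m$; 3)
$\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|$, for any $\mathbf{x}, \mathbf{y} \in \mathbb{R}^m$.

PROOF. 1) and 2) are obvious (prove them!). To be clearer, let us
prove 3) for $m = 2$ (for $m > 2$ one can use the Cauchy-Buniakovsky
inequality, which can be found in any course of Linear Algebra!). Both
sides in 3) are nonnegative, so the inequality is equivalent to

$$\|\mathbf{x} + \mathbf{y}\|^2 \le \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2 + 2 \|\mathbf{x}\| \|\mathbf{y}\|.$$

If $\mathbf{x} = (x_1, x_2)$ and $\mathbf{y} = (y_1, y_2)$, one has

$$(x_1 + y_1)^2 + (x_2 + y_2)^2 \le x_1^2 + x_2^2 + y_1^2 + y_2^2 + 2 \sqrt{(x_1^2 + x_2^2)(y_1^2 + y_2^2)},$$

or, $x_1 y_1 + x_2 y_2 \le \sqrt{(x_1^2 + x_2^2)(y_1^2 + y_2^2)}$. If the left side is negative,
there is nothing to prove. Otherwise, by squaring both sides we get

$$2 x_1 x_2 y_1 y_2 \le x_2^2 y_1^2 + x_1^2 y_2^2,$$

or $0 \le (x_2 y_1 - x_1 y_2)^2$. This last inequality is obvious. Moreover, from
this last inequality, we can say that in 3) we have equality if and only
if $x_2 y_1 - x_1 y_2 = 0$ and $x_1 y_1 + x_2 y_2 \ge 0$, i.e. if and only if $\mathbf{x}$ and $\mathbf{y}$ are
collinear and have the same direction (for instance $(x_1, x_2) = \lambda (y_1, y_2)$
with $\lambda \ge 0$).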
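The three properties of the norm are easy to experiment with. The sketch below is an illustration under assumptions: vectors are represented as Python tuples, and the helper names `norm`, `add` and `scale` are hypothetical.

```python
import math


def norm(x):
    """Euclidean norm sqrt(x_1^2 + ... + x_m^2) of a tuple of reals."""
    return math.sqrt(sum(c * c for c in x))


def add(x, y):
    """Componentwise sum of two vectors of the same length."""
    return tuple(a + b for a, b in zip(x, y))


def scale(alpha, x):
    """Componentwise product of a scalar and a vector."""
    return tuple(alpha * c for c in x)


x, y = (3.0, 4.0), (-1.0, 2.0)
print(norm(x))                                        # 5.0
print(norm(add(x, y)), norm(x) + norm(y))             # triangle inequality
print(norm(add(x, scale(2.0, x))), 3.0 * norm(x))     # same direction: equality
print(norm(add(x, scale(-1.0, x))), 2.0 * norm(x))    # opposite direction: strict
```

The last two lines show that collinearity alone is not enough for equality in 3): the two vectors must also point in the same direction.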


The couple $(\mathbb{R}^m, \|.\|)$ is called a normed space. We know that in
general, a normed space is a real vector space $X$ with a norm mapping
$\|.\|$ on it, which verifies the properties 1), 2) and 3) from Theorem 49.
We recall that a normed space $(X, \|.\|)$ is also a metric space w.r.t.
a canonically induced distance: $d(x, y) = \|x - y\|$ for any $x, y$ in $X$.
In the case of the normed space $(\mathbb{R}^m, \|.\|)$ the distance is given by the
formula

(1.2)    $d(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\| = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2}.$

This distance is a very special one because it comes from the "scalar
product"

(1.3)    $\langle \mathbf{x}, \mathbf{y} \rangle = \sum_{i=1}^{m} x_i y_i,$

i.e. this last one induces the norm $\|\mathbf{x}\| = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle} = \sqrt{\sum_{i=1}^{m} x_i^2}$ on $\mathbb{R}^m$
and this norm gives rise exactly to our distance (1.2). As we know from
the Linear Algebra course, the scalar product (1.3) endows $\mathbb{R}^m$ with a
geometry. The length of a vector $\mathbf{x}$ is its norm $\|\mathbf{x}\| = \sqrt{\sum_{i=1}^{m} x_i^2}$ and
the cosine of the angle $\theta$ between two nonzero vectors $\mathbf{x}$ and $\mathbf{y}$ of $\mathbb{R}^m$ is defined
as

$$\cos\theta = \frac{\langle \mathbf{x}, \mathbf{y} \rangle}{\|\mathbf{x}\| \|\mathbf{y}\|}.$$

The fact that the quantity $\frac{\langle \mathbf{x}, \mathbf{y} \rangle}{\|\mathbf{x}\| \|\mathbf{y}\|}$ is always between $-1$ and $1$ is exactly
the famous Cauchy-Schwarz-Buniakowsky inequality

(1.4)    $|\langle \mathbf{x}, \mathbf{y} \rangle| \le \|\mathbf{x}\| \|\mathbf{y}\|.$
It can be proved only by using the basic properties of a scalar product
(see any course in Linear Algebra).
Since $\mathbb{R}^m$ is a metric space relative to the distance $d$ defined in (1.2)
we can speak about the convergence of a sequence

$$\{\mathbf{x}^{(n)} = (x_1^{(n)}, x_2^{(n)}, \dots, x_m^{(n)})\}$$

from $\mathbb{R}^m$ to a vector $\mathbf{x} = (x_1, x_2, \dots, x_m)$: we say that $\mathbf{x}^{(n)} \to \mathbf{x}$ if and
only if $d(\mathbf{x}^{(n)}, \mathbf{x}) \to 0$, i.e. if and only if

$$\sqrt{\sum_{i=1}^{m} (x_i^{(n)} - x_i)^2} \to 0,$$

when $n \to \infty$. But a sum of squares tends to zero if
and only if each square in the sum tends to zero. Thus,
we just obtained a part of the following basic result:


THEOREM 50. (componentwise convergence). 1) A sequence

$$\{\mathbf{x}^{(n)} = (x_1^{(n)}, x_2^{(n)}, \dots, x_m^{(n)})\}$$

of vectors from $\mathbb{R}^m$ is convergent to a vector $\mathbf{x} = (x_1, x_2, \dots, x_m)$ if
and only if for any $i = 1, 2, \dots, m$, the numerical sequence $\{x_i^{(n)}\}$ is
convergent to $x_i$, when $n \to \infty$. 2) A sequence

$$\{\mathbf{x}^{(n)} = (x_1^{(n)}, x_2^{(n)}, \dots, x_m^{(n)})\}$$

is a Cauchy sequence in $\mathbb{R}^m$ if and only if any "component" sequence $\{x_i^{(n)}\}$
is a Cauchy sequence in $\mathbb{R}$ for any $i = 1, 2, \dots, m$. Since $\mathbb{R}$ is a complete
metric space (see Theorem 13), we see that $\mathbb{R}^m$ is also a complete metric
space.


PROOF. 1) was just proved before the statement of the theorem.
For 2) let us consider a sequence $\{\mathbf{x}^{(n)} = (x_1^{(n)}, x_2^{(n)}, \dots, x_m^{(n)})\}$. It is a
Cauchy sequence if for any $\varepsilon > 0$ we can find a rank $N_\varepsilon$ such that if
$n \ge N_\varepsilon$, one has that $d(\mathbf{x}^{(n+p)}, \mathbf{x}^{(n)}) < \varepsilon$ for any $p = 1, 2, \dots$ . This
means that whenever $n$ is large enough the distance $d(\mathbf{x}^{(n+p)}, \mathbf{x}^{(n)})$ is
small enough, independently of $p$. But

(1.5)    $d(\mathbf{x}^{(n+p)}, \mathbf{x}^{(n)}) = \sqrt{\sum_{i=1}^{m} (x_i^{(n+p)} - x_i^{(n)})^2} \ge |x_i^{(n+p)} - x_i^{(n)}|.$

So $|x_i^{(n+p)} - x_i^{(n)}|$ becomes small enough, independently of $p$, whenever
$n$ is large enough. And this is true for any fixed $i = 1, 2, \dots, m$. But
this last remark says that the sequence $\{x_i^{(n)}\}$ is a Cauchy sequence
for any fixed $i = 1, 2, \dots, m$. Conversely, if all the sequences $\{x_i^{(n)}\}$ are
Cauchy sequences for $i = 1, 2, \dots, m$, then, in (1.5), all the differences
$x_i^{(n+p)} - x_i^{(n)}$ become smaller and smaller, independently of $p$, whenever
$n$ becomes large enough. Hence, the whole sum $\sum_{i=1}^{m} (x_i^{(n+p)} - x_i^{(n)})^2$
becomes smaller and smaller, independently of $p$, whenever $n \to \infty$, i.e.
the sequence $\{\mathbf{x}^{(n)}\}$ is a Cauchy sequence in $\mathbb{R}^m$. The last statement
becomes very easy now (why?).


For instance, the sequence $\{(\frac{1}{n}, \frac{n}{n+1})\}$ is convergent to $(0, 1)$ in $\mathbb{R}^2$
because the first component $\{\frac{1}{n}\}$ goes to 0 and the second component
$\{\frac{n}{n+1}\}$ goes to 1.
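Componentwise convergence can be watched numerically. The sketch below uses the sequence $\mathbf{x}^{(n)} = (\frac{1}{n}, \frac{n}{n+1})$ as an assumed illustrative example with limit $(0, 1)$; the function names are hypothetical.

```python
import math


def x_n(n):
    """Illustrative sequence in R^2 whose components tend to 0 and to 1."""
    return (1.0 / n, n / (n + 1.0))


def dist(x, y):
    """Euclidean distance (1.2) between two points of R^m."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


limit = (0.0, 1.0)
for n in (1, 10, 1000, 10 ** 6):
    print(n, x_n(n), dist(x_n(n), limit))
```

The printed distances decrease to zero exactly because each of the two component errors does.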

A normed vector space, which is a complete metric space w.r.t. the 
distance defined by its norm, is called a Banach space. Such spaces are 
very useful in many engineering models. 

We recall now, in our particular case of the metric space (R”, d), 
where d is defined in (1.2), the following basic notion. 


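DEFINITION 16. Let $\mathbf{a} = (a_1, a_2, \dots, a_m)$ be a fixed point in $\mathbb{R}^m$ and
let $r > 0$ be a positive real number. The set $B(\mathbf{a}, r) = \{\mathbf{x} \in \mathbb{R}^m :
\|\mathbf{x} - \mathbf{a}\| = d(\mathbf{x}, \mathbf{a}) < r\}$ is called the open ball with centre at $\mathbf{a}$ and of
radius $r$. The set

$$B[\mathbf{a}, r] = \{\mathbf{x} \in \mathbb{R}^m : \|\mathbf{x} - \mathbf{a}\| = d(\mathbf{x}, \mathbf{a}) \le r\}$$

is said to be the closed ball with centre at $\mathbf{a}$ and of radius $r$ ($\ge 0$).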


For instance, if $m = 1$, $\mathbf{a} = a \in \mathbb{R}$, then $B(a, r) = (a - r, a + r)$, the
usual open interval with centre at $a$ and of length $2r$ (prove this!). In
the same case, $B[a, r] = [a - r, a + r]$. If $m = 2$, $B(\mathbf{a}, r)$ is the usual
open (without boundary!) disc, with centre at the point $\mathbf{a} = (a_1, a_2)$
and of radius $r$. If $m = 3$, $B(\mathbf{a}, r)$ is the common 3-D open (without
boundary) ball (a full sphere!) with centre at $\mathbf{a} = (a_1, a_2, a_3)$ and of
radius $r$. The closed ball $B[\mathbf{a}, r]$ is exactly the full sphere of radius $r$
and with centre at $\mathbf{a}$, which contains its boundary

$$S = \{(x, y, z) : (x - a_1)^2 + (y - a_2)^2 + (z - a_3)^2 = r^2\}.$$

This last surface $S$ is usually called the sphere of centre $\mathbf{a}$ and of radius
$r$.
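The difference between the open ball, the closed ball and the sphere is only in how the distance to the centre compares with $r$. The following sketch makes this explicit; the function names and the tolerance `tol` used for the sphere test are hypothetical choices, since an exact equality test is fragile in floating point arithmetic.

```python
import math


def dist(x, a):
    """Euclidean distance between two points of R^m given as tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, a)))


def in_open_ball(x, a, r):
    """True if x belongs to B(a, r), i.e. d(x, a) < r."""
    return dist(x, a) < r


def in_closed_ball(x, a, r):
    """True if x belongs to B[a, r], i.e. d(x, a) <= r."""
    return dist(x, a) <= r


def on_sphere(x, a, r, tol=1e-12):
    """True if d(x, a) equals r up to the tolerance tol."""
    return abs(dist(x, a) - r) <= tol


centre = (0.0, 0.0, 0.0)
for point in ((0.5, 0.5, 0.5), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)):
    print(point, in_open_ball(point, centre, 1.0),
          in_closed_ball(point, centre, 1.0), on_sphere(point, centre, 1.0))
```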


Let $D$ be an arbitrary subset of $\mathbb{R}^m$. A point $\mathbf{d}$ of $D$ is said to be
interior in $D$, if there is a small ball $B(\mathbf{d}, r)$, $r > 0$, centered at $\mathbf{d}$ such
that $B(\mathbf{d}, r) \subset D$. The set of all the interior points of $D$ is a subset of $D$ denoted
by $\mathrm{Int}\,D$, the interior of $D$. It can be empty. For instance, any finite
set of points has an empty interior.


DEFINITION 17. A subset $D$ of $\mathbb{R}^m$ is said to be an open subset if
for any $\mathbf{a}$ in $D$ there is a small $r > 0$ such that the open ball $B(\mathbf{a}, r)$
with centre at $\mathbf{a}$ and of radius $r$ is completely contained in $D$, i.e.
$B(\mathbf{a}, r) \subset D$. A subset $E$ of $\mathbb{R}^m$ is said to be closed if its complement

$$E^c = \mathbb{R}^m \setminus E = \{\mathbf{x} \in \mathbb{R}^m : \mathbf{x} \notin E\}$$

in $\mathbb{R}^m$ is an open subset of $\mathbb{R}^m$.


For instance, any point or any finite set of points are closed subsets
of $\mathbb{R}^m$. If $m = 1$, the closed intervals are closed subsets of $\mathbb{R}$. Moreover,
an open ball is an open set and a closed ball is a closed set (prove it
for $m = 1, 2, 3$!). It is not difficult to prove that a subset $D$ of $\mathbb{R}^m$ is
open if and only if it is equal to its interior. The boundary $B(D)$ of
a subset $D$ of $\mathbb{R}^m$ is by definition the collection of all the points $\mathbf{b}$ of
$\mathbb{R}^m$ such that any ball $B(\mathbf{b}, r)$, centered at $\mathbf{b}$ and of radius $r > 0$, has
common points with $D$ and with the complement $\mathbb{R}^m \setminus D$ of $D$. For
instance, the boundary of the disc $\{(x, y) : x^2 + y^2 < 1\}$ is the circle
$\{(x, y) : x^2 + y^2 = 1\}$ (prove it!). It is easy to see that $D$ is closed if
and only if it contains its boundary. The set $D \cup B(D)$ is called the
closure of $D$. It is exactly the set of all the limits of all convergent
sequences which have their terms in $D$.


REMARK 17. The set $\mathcal{O}$ of all the open subsets of $\mathbb{R}^m$ has the following
basic properties:

1) $\emptyset$, the empty set, and the whole set $\mathbb{R}^m$ are considered to be in
$\mathcal{O}$.

2) If $D_1, D_2, \dots, D_p$ are in $\mathcal{O}$, then their intersection $\bigcap_{i=1}^{p} D_i$ is also
in $\mathcal{O}$.

3) If $\{D_\alpha\}$ is any family of open subsets in $\mathcal{O}$, then their union
$\bigcup_\alpha D_\alpha$ is also in $\mathcal{O}$, i.e. it is also open.

We propose to the reader
to prove all of these properties and to state and prove the analogous
properties for the set $\mathcal{C}$ of all the closed subsets of $\mathbb{R}^m$. Mathematicians
say that a collection $\mathcal{O}$ of subsets of an arbitrary set $M$, which fulfils
the properties 1), 2) and 3) from above, gives rise to a topology on $M$.
For instance, in a metric space $(X, d)$, the collection $\mathcal{O}$ of all the open
subsets (the definition is the same as that for $\mathbb{R}^m$!) gives rise to the
natural topology of the metric space $X$. A set $M$ with a topology $\mathcal{O}$ on
it (a collection of subsets with the properties 1), 2) and 3)) is called
a topological space and we write it as $(M, \mathcal{O})$. This notion is the most
general notion which can describe a "distance" between two objects in
$M$. For instance, if $(M, \mathcal{O})$ is a topological space and if $a$ is a "point"
(an element) of $M$, then an element $b$ is said to be "closer" to $a$ than
the element $c$, if there are two "open" subsets $D$ and $F$ of $M$ such that
$a, b \in D$, $a, c \in F$ and $D \subset F$. Meditate on this fact in a metric space
$X$, for instance in the usual case $X = \mathbb{R}$.


Now, if $(X, d)$ is a metric space, the definition of an open ball $B(a, r)$
with centre at an element $a$ of $X$ and of radius $r > 0$ is similar to the
definition of the same notion in $\mathbb{R}^m$. Namely,

$$B(a, r) = \{x \in X : d(x, a) < r\}.$$

In the same way, a subset $D$ of $X$ is said to be open in $X$ if for any
$a \in D$ there is an open ball $B(a, r) = \{x \in X : d(x, a) < r\}$, with
centre at $a$ and of radius $r > 0$, such that $B(a, r) \subset D$. A subset $E$ of
$X$ is called a closed set if its complement $D = X \setminus E$ in $X$ is an
open set of $X$.


THEOREM 51. (a closedness criterion) A subset $E$ of a metric space
$(X, d)$ (in particular of $X = \mathbb{R}^m$) is closed if and only if any sequence
$\{x_n\}$ of elements in $E$, which is convergent to an element $x$ of $X$, has
its limit $x$ also in $E$.


PROOF. Let us assume that $E$ is closed and let $\{x_n\}$ be a sequence
of elements in $E$ which is convergent to an element $x$ of $X$. If $x$ were
not in $E$ then, since $D = X \setminus E$ is open, we could find a ball $B(x, r)$
with $r > 0$, such that $B(x, r) \subset D$, i.e. $B(x, r) \cap E = \emptyset$, the empty set.
But, since $x_n \to x$, i.e. $d(x_n, x) \to 0$, for $n$ large enough, $d(x_n, x) < r$,
or $x_n \in B(x, r)$. Since all the terms $x_n$ are in $E$, we succeeded in finding
at least one element $x_n \in B(x, r) \cap E = \emptyset$, which is a contradiction.
So, $x$ itself must be in $E$.

Conversely, we suppose now that any sequence of elements of $E$
which is convergent to an element $x$ of $X$ has its limit $x$ in $E$. If $E$
were not closed, $D = X \setminus E$ would not be open. This means that there
is at least one element $y$ of $D$ such that no ball $B(y, \frac{1}{n})$ is
contained in $D$. Hence, for any natural number $n > 0$, one can
find at least one element $y_n \in B(y, \frac{1}{n}) \cap E$ (why?). This means that
$d(y_n, y) < \frac{1}{n}$ and that $y_n \in E$ for any $n = 1, 2, \dots$ . Since $y_n \to y$ (why?)
and since $E$ has the above property, we see that $y$ must be also in $E$.
But... $y$ was chosen to be in $D = X \setminus E$, so it cannot be in $E$! We
have a new contradiction! So, we cannot suppose that $D$ is not open,
i.e. we are forced to say that $E$ is closed and the theorem is completely
proved.


DEFINITION 18. Let $A$ be a nonempty subset of $\mathbb{R}^m$ (or of an arbitrary
metric space $(X, d)$). By the closure $\overline{A}$ of $A$ in $\mathbb{R}^m$ (or in $X$) we
mean the set of the limits of all the convergent sequences with terms in
$A$.

In particular, any element $a$ of $A$ is in $\overline{A}$ (take the constant sequence
$a, a, a, \dots$, etc.). We can easily see that $\overline{A}$ is the least closed subset of
$X$ (in particular of $\mathbb{R}^m$) which contains $A$ (use Theorem 51).


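REMARK 18. $A$ is closed if and only if $A = \overline{A}$. The closure of
the open ball $B(\mathbf{a}, r)$ in $\mathbb{R}^m$ is exactly the closed ball
$B[\mathbf{a}, r]$ (in an arbitrary metric space $(X, d)$ one can only say that the closure
of $B(a, r)$ is contained in $B[a, r]$). The operation $A \mapsto \overline{A}$ has the following
main properties: 1) $\overline{A \cap B} \subset \overline{A} \cap \overline{B}$, 2) $\overline{A \cup B} = \overline{A} \cup \overline{B}$,
3) $A \cup B(A) = \overline{A}$, where $B(A) =
\{x \in X : B(x, r) \cap A \ne \emptyset$ and $B(x, r) \cap (X \setminus A) \ne \emptyset$ for any $r > 0\}$
is the boundary of $A$ in $X$ (prove all these statements!).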


We naturally extend the definition of a limit point for a subset $A$
of $\mathbb{R}$ (see Definition 4) to a subset of an arbitrary metric space $(X, d)$.
Let $A$ be a nonempty subset of a metric space $(X, d)$ (in particular
of $\mathbb{R}^m$). An element $x$ of $X$ is said to be a limit point for $A$ if there is a
sequence $\{x_n\}$ with terms in $A$, all distinct from $x$, which is convergent to $x$.

For instance, $(0, 0)$ is a limit point for the half-plane $\{(x, y) : y > 0\}$.
But $(0, -0.0001)$ is not a limit point for the same subset in $X = \mathbb{R}^2$.
The subset $\{(n, m) : n, m \in \mathbb{N}\}$ of $\mathbb{R}^2$ has no limit points. The set of
all the limit points of a subset $A$ of a metric space $(X, d)$ together with the
subset $A$ itself is exactly the closure $\overline{A}$ of $A$ (why?). The set of all the
limit points of the closed cube $C = [0, 1] \times [0, 1] \times [0, 1]$ is the cube $C$
itself. But... the set of all the limit points of an arbitrary closed subset
is not always the set itself. For instance, the set of all limit points of
a one-point set $\{a\}$ of $X$ is the empty set (which is distinct from $\{a\}$). A
convergent sequence $\{x_n\}$ which has infinitely many distinct values has
exactly one limit point, namely its limit $x$.


DEFINITION 19. A nonempty subset $A$ in a metric space $(X, d)$ is
said to be bounded if there is a "reference" element $c \in X$ and a positive
real number $M$ such that $d(c, x) < M$ for any element $x$ of $A$.


REMARK 19. It appears that the definition depends on the choice
of the "reference" element $c$, i.e. that the boundedness of $A$ is a
$c$-boundedness. In fact, the definition does not depend on the element
$c$. Namely, if a subset $A$ is bounded relative to an element $c$ of $X$,
it is bounded relative to any other element $b$ of $X$. Indeed, $d(b, x) \le
d(b, c) + d(c, x) < d(b, c) + M$, which is a fixed positive number w.r.t.
the variable element $x$ of $A$. Hence, $A$ is also $b$-bounded. In a normed
space (see Definition 13) we take as a "reference" element $c$ the element
$c = 0$. Thus, $A$ is bounded in a normed space $(X, \|.\|)$ if and only if
there is a positive real number $M$ such that $\|x\| < M$ for any $x$ of $A$.


Cesaro-Bolzano-Weierstrass Theorem (see Theorem 12) has an ex- 
tension to R™ for any m = 2,3,... . 
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THEOREM 52. (Bolzano- Weierstrass Theorem). Let A be a bounded 
and infinite subset of R™. Then A has at least one limit point in R™. In 
particular, any bounded sequence in R™ has a convergent subsequence. 


PROOF. To make the idea behind the formal proof easier to understand,
we take the particular case $m = 2$ (the case $m = 1$ was considered in
Theorem 12). So, $A$ is an infinite (it contains an infinite number of
distinct elements) and bounded subset of $\mathbb{R}^2$. Any element of $A$
is a pair $(x,y)$, where $x, y \in \mathbb{R}$. Since $A$ is bounded by a
positive real number $M$, we can write $\|(x,y)\| < M$ for any pair $(x,y)$
of $A$, or $\sqrt{x^2 + y^2} < M$. Thus, the projections of $A$ on the
coordinate axes,
$$A_1 = \{a_1 \in \mathbb{R} : \text{there is an } a_2 \in \mathbb{R} \text{ with } (a_1,a_2) \in A\}$$
and
$$A_2 = \{b_2 \in \mathbb{R} : \text{there is a } b_1 \in \mathbb{R} \text{ with } (b_1,b_2) \in A\},$$
are bounded in $\mathbb{R}$ (prove it and make a drawing!). Since $A$ is
infinite, at least one of $A_1$ or $A_2$ is infinite (why?). We suppose that
$A_1$ is infinite. Let us now apply the Cesaro-Bolzano-Weierstrass Theorem
(Theorem 12) to the subset $A_1$ of $\mathbb{R}$. Hence, there is a limit
point $x_1$ of $A_1$, i.e. there is a sequence $\{a_1^{(n)}\}$ of elements
of $A_1$ which is convergent to $x_1$. Let us look now at the definition of
$A_1$! For any $a_1^{(n)}$, $n = 1, 2, \ldots$, we can find an element
$a_2^{(n)}$ in $\mathbb{R}$ such that the pair $(a_1^{(n)}, a_2^{(n)})$ is in
$A$. In fact, the sequence $\{a_2^{(n)}\}$ is bounded and its terms belong to
$A_2$ (why?). If $A_2$ is also infinite, applying again the
Cesaro-Bolzano-Weierstrass theorem to the subset $\{a_2^{(n)}\}$, we get a
limit point $x_2$ of this last sequence. This means that we can find a
subsequence $\{a_2^{(k_n)}\}$ of $\{a_2^{(n)}\}$ which is convergent to $x_2$.
For any $k_n$, $n = 1, 2, \ldots$, we consider the term $a_1^{(k_n)}$ of the
sequence $\{a_1^{(n)}\}$ found above. We obtain a new sequence
$\{(a_1^{(k_n)}, a_2^{(k_n)})\}$ of elements of $A$, which is convergent to
the pair $(x_1, x_2)$ (why? ...because it is componentwise convergent!).
Thus $(x_1, x_2)$ is a limit point of $A$.

What happens if $A_2$ is finite? Then at least one term $a_2^{(n_0)}$
repeats itself an infinite number of times. We suppose that for
$h_1 < h_2 < \ldots$ one has $a_2^{(h_n)} = a_2^{(n_0)}$ for any
$n = 1, 2, \ldots$, and we denote this common value by $x_2$. So, the
sequence $\{(a_1^{(h_n)}, a_2^{(h_n)})\}$, with terms in $A$, is convergent to
$(x_1, x_2)$, which becomes in this way a limit point of $A$. A question can
arise here: why can we choose all the elements of the sequence
$\{(a_1^{(n)}, a_2^{(n)})\}$ to be distinct from each other? Because the
sequence $\{a_1^{(n)}\}$ can be chosen from the beginning to contain only
distinct elements ($A_1$ is infinite!). Hence, in both cases $A$ has a limit
point and the proof is completed.

We shall see later the fundamental importance of this theoretical
result. A limit point is also called in the literature an accumulation
point.
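The statement can also be explored numerically. The sketch below does not follow the projection argument of the proof; it uses the classical bisection idea instead (cut a bounding box into four quarters and keep a quarter that still holds the most points), which is an assumption of this illustration. The function name `bisection_subsequence` and the sample sequence $(\cos n, \sin n)$ are hypothetical choices, and a finite sample can only imitate an infinite set for a limited number of steps.

```python
import math


def bisection_subsequence(points, steps=10):
    """Return increasing indices k_1 < k_2 < ... whose points cluster together.

    At each step the current bounding box is cut into four quarters and the
    quarter holding the most remaining points is kept (for an infinite bounded
    set, at least one quarter would still hold infinitely many points).
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    lo_x, hi_x, lo_y, hi_y = min(xs), max(xs), min(ys), max(ys)
    candidates = list(range(len(points)))
    chosen, last = [], -1
    for _ in range(steps):
        mid_x, mid_y = (lo_x + hi_x) / 2, (lo_y + hi_y) / 2
        quarters = {}
        for i in candidates:
            x, y = points[i]
            quarters.setdefault((x > mid_x, y > mid_y), []).append(i)
        # Keep the fullest quarter and shrink the box accordingly.
        key, candidates = max(quarters.items(), key=lambda kv: len(kv[1]))
        lo_x, hi_x = (mid_x, hi_x) if key[0] else (lo_x, mid_x)
        lo_y, hi_y = (mid_y, hi_y) if key[1] else (lo_y, mid_y)
        later = [i for i in candidates if i > last]
        if not later:  # a finite sample eventually runs out of points
            break
        last = later[0]
        chosen.append(last)
    return chosen


# Hypothetical bounded sequence in the plane: points on the unit circle.
sample = [(math.cos(n), math.sin(n)) for n in range(5000)]
indices = bisection_subsequence(sample, steps=8)
for k in indices:
    print(k, sample[k])
```

The printed points get closer and closer to each other, because the j-th selected point lies in a box whose diagonal has been halved j times. This mirrors the existence of a convergent subsequence, without proving it.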

Since the bounded and closed subsets of a space of the form $\mathbb{R}^m$
are very useful in many applications, we shall call them compact sets.
For instance, $[a,b]$, $\{(x,y) : x^2 + y^2 \le r^2\}$ and, generally, any
closed ball are compact sets in their corresponding arithmetical spaces
of the type $\mathbb{R}^m$. A finite union and any intersection of compact
sets is again a compact set (prove it!). An infinite union of compact sets
is not always a compact set (find a counterexample!). For instance,
$D = \{\frac{1}{n} : n = 1, 2, \ldots\}$ is bounded but it is not closed,
because $\frac{1}{n} \to 0$ and $0$ is not in $D$. So, $D$ is not a compact
set, but its closure $\overline{D} = \{0\} \cup \{\frac{1}{n}\}$ is a compact
subset of $\mathbb{R}$ (prove this!). Any finite set of points in
$\mathbb{R}^m$ is a compact set (why?).

Now we give a useful characterization of compact sets in $\mathbb{R}^m$.


THEOREM 53. A subset $C$ of $\mathbb{R}^m$ is a compact set if and only if
any sequence of elements of $C$ contains a convergent subsequence with its
limit in $C$.


PROOF. We suppose that $C$ is a compact set in $\mathbb{R}^m$ and let
$\{\mathbf{x}^{(n)}\}$ be a sequence with terms in $C$. If
$\{\mathbf{x}^{(n)}\}$ has an infinite number of distinct elements, then,
$A = \{\mathbf{x}^{(n)}\}$ being bounded ($A \subset C$ and $C$ is bounded),
we can apply Theorem 52 and find that there is a convergent subsequence
$\{\mathbf{x}^{(k_n)}\}$ of $\{\mathbf{x}^{(n)}\}$. Since $C$ is closed, the
limit of $\{\mathbf{x}^{(k_n)}\}$ belongs to $C$ (see Theorem 51). If
$\{\mathbf{x}^{(n)}\}$ has only a finite number of distinct terms, one of
them appears in an infinite number of places. So, we take the constant
subsequence generated by it.

Conversely, we assume that $C$ has the property indicated in the
statement of the theorem. Let us prove firstly that $C$ is bounded. If
it were not bounded, for any $n = 1, 2, \ldots$ one could find a vector
$\mathbf{a}_n$ in $C$ such that $\|\mathbf{a}_n\| > n$. The hypothesis says
that the sequence $\{\mathbf{a}_n\}$ has a convergent subsequence
$\{\mathbf{a}_{k_n}\}$. Let $\mathbf{a} = \lim_{n \to \infty} \mathbf{a}_{k_n}$
be the limit of the sequence $\{\mathbf{a}_{k_n}\}$. Then
$$k_n < \|\mathbf{a}_{k_n}\| \le \|\mathbf{a}_{k_n} - \mathbf{a}\| + \|\mathbf{a}\|.$$
Taking limits in the extreme sides of these inequalities, we get
$\infty \le \|\mathbf{a}\|$, a contradiction. Hence, $C$ must be bounded.
Let us prove now that $C$ is closed by using again Theorem 51. For this, let
$\{\mathbf{y}_n\}$ be a sequence with elements in $C$ which is convergent to
a limit $\mathbf{y}$ in $\mathbb{R}^m$. By the hypothesis on $C$, the sequence
$\{\mathbf{y}_n\}$ has a subsequence $\{\mathbf{y}_{k_m}\}$ which is
convergent to an element $\mathbf{z}$ of $C$. Since $\{\mathbf{y}_n\}$ is
convergent to $\mathbf{y}$, any subsequence of $\{\mathbf{y}_n\}$ is also
convergent to $\mathbf{y}$. Indeed, let us prove for instance that
$\mathbf{z} = \mathbf{y}$. For this, let us evaluate
$d(\mathbf{z}, \mathbf{y})$, the distance between $\mathbf{z}$ and
$\mathbf{y}$:
$$(1.6) \qquad d(\mathbf{z}, \mathbf{y}) \le d(\mathbf{z}, \mathbf{y}_{k_m}) + d(\mathbf{y}_{k_m}, \mathbf{y}_n) + d(\mathbf{y}_n, \mathbf{y}),$$
where $m$ and $n$ are arbitrarily chosen. If we let $m, n \to \infty$ in this
last inequality, we get that $d(\mathbf{z}, \mathbf{y}) = 0$, i.e.
$\mathbf{z} = \mathbf{y}$ (why?). Here we used the fact that a convergent
sequence is also a Cauchy sequence, i.e. for $m, n$ large enough, the
distance $d(\mathbf{y}_{k_m}, \mathbf{y}_n)$ goes to zero. Now, since
$\mathbf{z}$ is in $C$, we get that $\mathbf{y}$ is also in $C$, i.e. $C$ is
closed and the theorem is proved.


The above characterization of compact subsets of $\mathbb{R}^m$ leads us to
the introduction of the notion of a compact subset in an arbitrary metric
space $(X,d)$. We say that a subset $C$ of $X$ is compact if any sequence of
elements of $C$ has a subsequence which is convergent to an element
of $C$.

For instance, any convergent sequence $\{x_n\}$ in a metric space $X$,
together with its limit $x$, is a compact subset of $X$ (prove it!). Thus,
$C = \{x_n\} \cup \{x\}$ is a compact subset of $X$.


2. Continuous functions of several variables 


Let $A$ be a nonempty subset of $\mathbb{R}^n$, the "arithmetical"
$n$-dimensional vector space, and let $f : A \to \mathbb{R}$ be a function
defined on $A$ with values in $\mathbb{R}$. Since the variable
$\mathbf{x} = (x_1, x_2, \ldots, x_n)$ is a vector determined by $n$ free
scalar quantities $x_1, x_2, \ldots, x_n$, we say that our function is a
function of $n$ variables. If $n \ge 2$, we say that $f$ is a function of
"several" variables. Since the values of $f$ are scalars (real numbers),
we say that $f$ is a scalar function of $n$ variables. A map
$\mathbf{f} : A \to \mathbb{R}^m$ is called a vector function of $n$
variables. This time, the values of $\mathbf{f}$ are $m$-dimensional vectors.
Hence $\mathbf{f}(\mathbf{x}) = (y_1, y_2, \ldots, y_m)$ and we see that the
numbers $y_1, y_2, \ldots, y_m$ are themselves functions
$f_1, f_2, \ldots, f_m$ of $\mathbf{x}$:
$y_1 = f_1(\mathbf{x}), \ldots, y_m = f_m(\mathbf{x})$. These scalar functions
$f_1, f_2, \ldots, f_m$, defined on $A$ with values in $\mathbb{R}$ this time,
are called the components of $\mathbf{f}$. We write this as
$\mathbf{f} = (f_1, f_2, \ldots, f_m)$ and interpret it as a "vector" of $m$
components (coordinates) $f_1, f_2, \ldots, f_m$. In applications
$\mathbf{f}$ is also called a vector field of $n$ variables. "Field" comes
from "field of forces". For instance,


$$\mathbf{f} : \mathbb{R}^2 \to \mathbb{R}^2, \qquad \mathbf{f}(x,y) = (xy, \; x - y)$$
is a vector field in the plane ($\mathbb{R}^2$) of 2 variables. Its components
are $f_1(x,y) = xy$ and $f_2(x,y) = x - y$. We can give its image at some
points. For instance, we can translate the vector
$\mathbf{f}(2,3) = (2 \cdot 3, 2 - 3) = (6, -1)$ to the point $(2,3)$ and so we
get "the image" of $\mathbf{f}$ at $(2,3)$. In this way we can fill the whole
plane $\mathbb{R}^2$ with vectors (forces), i.e. we get a "field" of forces on
the whole plane. If $n = 1$, the image of a vector field
$\mathbf{f} : A \to \mathbb{R}^m$ ($A \subset \mathbb{R}$) is a "curve" in
$\mathbb{R}^m$. For instance, $\mathbf{f}(t) = (R\cos t, R\sin t)$,
$t \in [0, 2\pi)$, has as image in the plane $\mathbb{R}^2$ the usual circle of
radius $R$ with centre at the origin $(0,0)$. We say that the two components
of $\mathbf{f}$, $f_1(t) = R\cos t$ and $f_2(t) = R\sin t$, are the parametric
equations of this circle. One also writes this as $x = R\cos t$,
$y = R\sin t$, $t \in [0, 2\pi)$. We can also interpret the image of a vector
field $\mathbf{f} : [0,T] \to \mathbb{R}^m$ ($m = 2$ or $m = 3$) as the
trajectory of a moving point
$$M(f_1(t), f_2(t), \ldots, f_m(t)),$$
where $t$ measures the "time" between the starting moment (usually
$t = 0$) and the ending moment $t = T$. For instance,
$\mathbf{f}(t) = (t, t^2)$, $t \in A = [0,10]$, is a parabolic trajectory along
the arc of the parabola $y = x^2$, $x \in [0,10]$. The new vector field
$$\mathbf{f}'(t) = (f_1'(t), f_2'(t), \ldots, f_m'(t))$$
(the componentwise derivative), associated to the vector field
$$\mathbf{f}(t) = (f_1(t), f_2(t), \ldots, f_m(t)), \quad t \in [0,T],$$
is called the velocities field of the field $\mathbf{f}$.
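As a small numerical illustration of the componentwise derivative, the sketch below approximates the velocities field of the parabolic trajectory $(t, t^2)$ and of a circle by central differences. The helper names `parabola`, `circle` and `velocity`, the step `h` and the radius are assumptions made only for this example.

```python
import math


def parabola(t):
    """The trajectory f(t) = (t, t^2) from the text."""
    return (t, t * t)


def circle(t, radius=2.0):
    """Parametric equations x = R cos t, y = R sin t (R = radius, chosen here)."""
    return (radius * math.cos(t), radius * math.sin(t))


def velocity(curve, t, h=1e-6):
    """Central-difference approximation of the componentwise derivative f'(t)."""
    forward = curve(t + h)
    backward = curve(t - h)
    return tuple((p - q) / (2 * h) for p, q in zip(forward, backward))


# The exact velocity of the parabola is (1, 2t).
print(velocity(parabola, 3.0))

# On a circle centred at the origin the velocity is orthogonal to the position.
position = circle(1.0)
speed = velocity(circle, 1.0)
print(position[0] * speed[0] + position[1] * speed[1])
```

The finite-difference quotient is only an approximation of the derivative, so the printed values agree with $(1, 2t)$ and with $0$ up to a small numerical error.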

In order to describe the "breaking" phenomena at a given point
$\mathbf{a} = (a_1, a_2, \ldots, a_n)$ of $\mathbb{R}^n$, we need to see what
happens with the values of a vector function (which describes our
phenomenon) $\mathbf{f} : A \to \mathbb{R}^m$ whenever we come closer and
closer to $\mathbf{a}$. For this, $\mathbf{a}$ must be a limit point of the
definition domain $A$. We have to study the convergence of the sequence of
vectors $\{\mathbf{f}(\mathbf{x}^{(n)})\}$ in $\mathbb{R}^m$ whenever the
sequence $\{\mathbf{x}^{(n)}\}$, with terms in $A$, converges to $\mathbf{a}$
in the metric space $\mathbb{R}^n$. The most convenient situation is that when
the values $\{\mathbf{f}(\mathbf{x}^{(n)})\}$, for all the sequences
$\{\mathbf{x}^{(n)}\}$ which are convergent to $\mathbf{a}$, become closer and
closer to one and the same vector $\mathbf{L}$ of $\mathbb{R}^m$. This is why
we now give the following definition.


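DEFINITION 20. Let $A$ be a subset of $\mathbb{R}^n$ and let
$\mathbf{a} = (a_1, a_2, \ldots, a_n)$ be a limit point of $A$. We say that
$\mathbf{L} \in \mathbb{R}^m$ is the limit of a vector function
$\mathbf{f} : A \to \mathbb{R}^m$ at the point $\mathbf{a}$ (we write
$\mathbf{L} = \lim_{\mathbf{x} \to \mathbf{a}} \mathbf{f}(\mathbf{x})$) if for
every sequence $\{\mathbf{x}^{(n)}\}$, $\mathbf{x}^{(n)} \ne \mathbf{a}$,
$\mathbf{x}^{(n)} \in A$, which is convergent to the vector $\mathbf{a}$, one
has that the sequence of images $\{\mathbf{f}(\mathbf{x}^{(n)})\}$ of
$\{\mathbf{x}^{(n)}\}$ through $\mathbf{f}$ is convergent to $\mathbf{L}$. If
such an $\mathbf{L}$ exists, independently of the choice of the sequence
$\{\mathbf{x}^{(n)}\}$, we say that $\mathbf{f}$ has limit $\mathbf{L}$ at
$\mathbf{a}$. This limit $\mathbf{L}$ depends only on $\mathbf{f}$ and on
$\mathbf{a}$.

If there is such a common limit $\mathbf{L}$, it is unique, because the
limit of a sequence in a metric space is unique (if it exists!).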

For instance, let us compute $\lim_{(x,y) \to (-1,2)} f(x,y)$, where
$$f(x,y) = xy + x^2 + \ln(x^2 + y^2).$$
Let us take a sequence $\{(x_n, y_n)\}$ which is convergent to $(-1,2)$. This
means that $x_n \to -1$ and $y_n \to 2$ (see Theorem 50). But we know
that the "taking limit" operation is compatible with the multiplication,
the addition and the logarithm function (we say that $\ln$ is continuous!)
(see also Theorem 14). Hence,
$$f(x_n, y_n) = x_n y_n + x_n^2 + \ln(x_n^2 + y_n^2)$$
will be convergent to
$$(-1) \cdot 2 + (-1)^2 + \ln((-1)^2 + 2^2) = -1 + \ln 5.$$
We see that this limit is independent of the starting sequence $(x_n, y_n)$
which tends to $(-1,2)$. Thus, for any sequence $(x_n, y_n)$ which is
convergent to $(-1,2)$,
$$\lim_{n \to \infty} f(x_n, y_n) = -1 + \ln 5.$$
In fact, we see that for any sequence $(x_n, y_n)$ which is convergent to
$(-1,2)$,
$$\lim_{(x_n, y_n) \to (-1,2)} f(x_n, y_n) = f(-1, 2).$$
This happens because any elementary function of several variables is
"continuous" (see the definition below) on its definition domain.


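DEFINITION 21. Let $A$ be a subset of $\mathbb{R}^n$ and let
$\mathbf{a} = (a_1, a_2, \ldots, a_n)$ be a point of $A$. We say that the vector
function $\mathbf{f} : A \to \mathbb{R}^m$ is continuous at the point
$\mathbf{a}$ if for every sequence $\{\mathbf{x}^{(n)}\}$ of elements of $A$,
$\mathbf{x}^{(n)} \ne \mathbf{a}$, which is convergent to the vector
$\mathbf{a}$, one has that the sequence of images
$\{\mathbf{f}(\mathbf{x}^{(n)})\}$ of $\{\mathbf{x}^{(n)}\}$ through
$\mathbf{f}$ is convergent to $\mathbf{f}(\mathbf{a})$, the value of
$\mathbf{f}$ at $\mathbf{a}$. We say that $\mathbf{f}$ is continuous on the set
$A$ if $\mathbf{f}$ is continuous at any point of $A$.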


We see that $\mathbf{f}$ is continuous at a point $\mathbf{a}$ if and only if
it has a limit $\mathbf{L}$ at $\mathbf{a}$ and this $\mathbf{L}$ is equal to
$\mathbf{f}(\mathbf{a})$, the value of $\mathbf{f}$ at the point $\mathbf{a}$.
The above definition is in accordance with the engineer's perception of
approximation processes. Let us suppose that $\mathbf{f}$ describes a physical
phenomenon $P$ and we are interested in the variation of this phenomenon
around a fixed "point" (vector) $\mathbf{a}$. Let us take a neighboring point
$\mathbf{z}$ of $\mathbf{a}$ and let us approximate $\mathbf{z}$ by
$\mathbf{a}$. In this case, can we approximate $\mathbf{f}(\mathbf{z})$ by
$\mathbf{f}(\mathbf{a})$? Or, can we consider that $P$ is "almost the same" at
$\mathbf{z}$ as at $\mathbf{a}$? We can do this if $\mathbf{f}$ is continuous
at $\mathbf{a}$. Otherwise, we cannot do such approximations. We must be very
careful, for instance, in the case of earthquake models around the so-called
"singular" points (see the example below). Now we think that the reader is
convinced that the notion of continuity is important in modelling physical
phenomena. It is not difficult to prove that all the elementary functions and
their compositions are continuous functions. In the following we supply an
example in which we shall see that the case of vector fields of several
variables (for $n > 1$) is more complicated than the case of one variable.
Let us see now if the following nonelementary (why?) function


$$f(x,y) = \begin{cases} \dfrac{xy}{x^2 + y^2}, & \text{if } x \ne 0 \text{ or } y \ne 0, \\ 0, & \text{if } x = 0 \text{ and } y = 0, \end{cases}$$

$f : \mathbb{R}^2 \to \mathbb{R}$, is continuous or not on the whole
$\mathbb{R}^2$. If $(a,b) \ne (0,0)$, then $f(x,y) = \frac{xy}{x^2 + y^2}$ on
a small disc (not containing $(0,0)$) with centre at $(a,b)$ (and a small
radius). Since the restriction of $f$ to this last disc is an elementary
function, $f$ is continuous at $(a,b)$. What happens at $(0,0)$? If the
function $f$ were continuous at $(0,0)$ then, for any sequence $(x_n, y_n)$
which tends to $(0,0)$ (i.e. $x_n \to 0$ and $y_n \to 0$), we should have
that $f(x_n, y_n) \to f(0,0) = 0$. Let us take a nonzero real number $r$ and
let $\{x_n\}$ be an arbitrary sequence of nonzero real numbers which is
convergent to $0$. Take now $y_n = r x_n$ for any $n = 1, 2, \ldots$. This
means that all the pairs $(x_n, y_n)$ are on the line $y = rx$ (its slope is
$r$) and that the sequence $\{(x_n, y_n)\}$ is convergent to $(0,0)$. But
$$f(x_n, y_n) = \frac{r x_n^2}{x_n^2 + r^2 x_n^2} = \frac{r}{1 + r^2} \ne 0.$$
So the function $f$ is not continuous at $(0,0)$. Moreover, since the limit
$$\lim_{(x_n, r x_n) \to (0,0)} f(x_n, y_n) = \frac{r}{1 + r^2}$$
is dependent on the slope $r$ of the line $y = rx$ on which we have
chosen our sequence $(x_n, y_n)$, we see that the function $f$ has no limit
at $(0,0)$. Hence, we cannot extend $f$ "by continuity" at $(0,0)$ with any
real value. Such a point $(0,0)$ is called an essential singular point for
$f$. This means that if we come closer and closer to $(0,0)$ on different
sequences $\{(x_n, y_n)\}$, we obtain an infinite number of distinct values
for the limit $\lim_{(x_n, y_n) \to (0,0)} f(x_n, y_n)$ (as we just saw
above!).
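The dependence of the limit on the slope can be checked numerically. The sketch below evaluates the function $f(x,y) = xy/(x^2+y^2)$ (with $f(0,0) = 0$) along the lines $y = rx$, using the particular sequence $x_n = 1/n$, which is an assumption of the example; the helper name `values_along_line` is hypothetical.

```python
def f(x, y):
    """f(x, y) = xy / (x^2 + y^2) away from the origin, and f(0, 0) = 0."""
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x * x + y * y)


def values_along_line(r, n_terms=5):
    """Values f(x_n, r * x_n) for the sample sequence x_n = 1/n -> 0."""
    return [f(1.0 / n, r / n) for n in range(1, n_terms + 1)]


for slope in (0.5, 1.0, 2.0, -3.0):
    expected = slope / (1 + slope ** 2)
    print(slope, values_along_line(slope), expected)
```

Along each line the values are constant and equal to $r/(1+r^2)$, so different slopes give different limits at the origin, in agreement with the computation above.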


The following criterion reduces the study of the limit or of the
continuity of a vector function $\mathbf{f} : A \to \mathbb{R}^m$ at a point
$\mathbf{a} \in A$, where $A$ is an open subset of $\mathbb{R}^n$ and
$\mathbf{f} = (f_1, f_2, \ldots, f_m)$, to the study of the same
properties for the scalar functions $f_1, f_2, \ldots, f_m$.

THEOREM 54. With these last notations, 1) $\mathbf{f} = (f_1, f_2, \ldots, f_m)$
has the limit $\mathbf{L} = (L_1, L_2, \ldots, L_m)$ at the point $\mathbf{a}$
if and only if every component function $f_j$ has the limit $L_j$ at the same
point $\mathbf{a}$, for $j = 1, 2, \ldots, m$, and 2) $\mathbf{f}$ is
continuous at the point $\mathbf{a}$ if and only if every component
function $f_j$ is continuous at $\mathbf{a}$.


PROOF. Everything comes from the fact that the convergence in
the normed spaces $\mathbb{R}^m$ is a componentwise convergence (see
Theorem 50). Indeed, let us assume that $\mathbf{f} = (f_1, f_2, \ldots, f_m)$
has the limit $\mathbf{L} = (L_1, L_2, \ldots, L_m)$ at $\mathbf{a}$. Hence,
for any sequence $\{\mathbf{x}^{(n)}\}$ which is convergent to $\mathbf{a}$,
one gets that $\lim \mathbf{f}(\mathbf{x}^{(n)}) = \mathbf{L}$, i.e.
$\lim f_j(\mathbf{x}^{(n)}) = L_j$ for $j = 1, 2, \ldots, m$ (we just applied
the "componentwise" principle). The existence is included here! (why?).
Conversely, if for any $j = 1, 2, \ldots, m$ the limit
$\lim f_j(\mathbf{x}^{(n)}) = L_j$ exists, then the limit
$\lim \mathbf{f}(\mathbf{x}^{(n)}) = \mathbf{L}$ exists and
$\mathbf{L} = (L_1, L_2, \ldots, L_m)$. We add the fact that
$\mathbf{f} = (f_1, f_2, \ldots, f_m)$ is continuous at $\mathbf{a}$ if and
only if
$$\mathbf{L} = (L_1, L_2, \ldots, L_m) = \mathbf{f}(\mathbf{a}) = (f_1(\mathbf{a}), f_2(\mathbf{a}), \ldots, f_m(\mathbf{a})),$$
or if and only if $f_j(\mathbf{a}) = L_j$ for any $j = 1, 2, \ldots, m$. But
this means exactly the continuity of every $f_j$ at $\mathbf{a}$ for
$j = 1, 2, \ldots, m$.


Using this last continuity test, we can easily decide if a vector func- 
tion is continuous or not. For instance, 


is continuous on $\mathbb{R}^3$ because all the scalar component functions
fil2,y, 2) = @, fo(z,y, z) = 24 + y 


and $f_3(x,y,z) = 2x + 3y - 2z$ are polynomial functions, so they are all
continuous on $\mathbb{R}^3$.


REMARK 20. The existence of a limit at a point and the continuity
at a point are "local" properties. They are defined "around" a given
point $\mathbf{a}$. If we fix an $n$-D continuous curve
$\gamma : [a,b] \to A \subset \mathbb{R}^n$ and if $\mathbf{a} = \gamma(t_0)$
is a point "on $\gamma$" (it is in the image of $\gamma$), we say that
a vector function $\mathbf{f} = (f_1, f_2, \ldots, f_m)$, defined on $A$ with
values in $\mathbb{R}^m$, is continuous at $\mathbf{a}$ along the curve
$\gamma$ if the composed function
$\mathbf{f} \circ \gamma : [a,b] \to \mathbb{R}^m$ (a new curve in
$\mathbb{R}^m$) is continuous at $t_0$. This means that if we take any
sequence of points $\{\mathbf{x}^{(n)}\}$ of $A$ ($A$ is considered to be
open!) on $\gamma$ ($\mathbf{x}^{(n)} = \gamma(t_n)$), which comes closer and
closer to $\mathbf{a}$, then
$\lim \mathbf{f}(\mathbf{x}^{(n)}) = \mathbf{f}(\mathbf{a})$. For instance,
$$f(x,y) = \begin{cases} \dfrac{xy}{x^2 + y^2}, & \text{if } x \ne 0 \text{ or } y \ne 0, \\ 0, & \text{if } x = 0 \text{ and } y = 0, \end{cases}$$
$f : \mathbb{R}^2 \to \mathbb{R}$, is not continuous at $\mathbf{a} = (0,0)$,
but it is continuous at $(0,0)$ along both coordinate axes. It has limits
along any other fixed line $y = rx$ which passes through $(0,0)$, but the
limits are not the same! (see the above comments on this example). It is
possible to construct a function of two variables which is continuous on
$\mathbb{R}^2$ except at the origin, where it has limit $0$ along any line
which passes through $(0,0)$, but it has no limit at $(0,0)$ (find such a
function!).


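THEOREM 55. The composition of two continuous functions
is also a continuous function.


PROOF. Let $A$ be an open subset of $\mathbb{R}^n$, let $B$ be another open
subset of $\mathbb{R}^m$ and let $\mathbf{f} : A \to B$,
$\mathbf{g} : B \to \mathbb{R}^p$ be two continuous functions on their
definition domains. The theorem says that the composed function
$\mathbf{h} : A \to \mathbb{R}^p$, $\mathbf{h} = \mathbf{g} \circ \mathbf{f}$,
i.e. $\mathbf{h}(\mathbf{x}) = \mathbf{g}(\mathbf{f}(\mathbf{x}))$ for any
$\mathbf{x} \in A$, is also a continuous function on $A$. To prove this, let
us take a point $\mathbf{a} \in A$ and an arbitrary sequence
$\{\mathbf{x}^{(n)}\}$ in $A$ which is convergent to $\mathbf{a}$ w.r.t. the
distance of $\mathbb{R}^n$. Since $\mathbf{f}$ is continuous on $A$, in
particular it is continuous at $\mathbf{a}$. So, the sequence
$\{\mathbf{f}(\mathbf{x}^{(n)})\}$ is convergent to $\mathbf{f}(\mathbf{a})$.
Now, since $\mathbf{g}$ is continuous on $B$, in particular it is continuous
at the point $\mathbf{f}(\mathbf{a})$ of $B$. Hence, the sequence
$\{\mathbf{g}(\mathbf{f}(\mathbf{x}^{(n)}))\}$ tends to
$\mathbf{g}(\mathbf{f}(\mathbf{a})) = \mathbf{h}(\mathbf{a})$ and so
$\mathbf{h}(\mathbf{x}^{(n)}) = \mathbf{g}(\mathbf{f}(\mathbf{x}^{(n)}))$ is
convergent to $\mathbf{h}(\mathbf{a})$. This means that the composed function
$\mathbf{h}$ is continuous at $\mathbf{a}$. Since $\mathbf{a}$ was arbitrarily
chosen in $A$, we have that $\mathbf{h}$ is continuous on the whole $A$.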


This theorem is very useful, because almost all the functions commonly
used in applications are compositions of elementary functions,
and these last ones are continuous on their definition domains. For
instance,
$$f(x,y) = \frac{\cos(x + \sin xy)}{1 + \ln(x^2 + y^2)}$$
is defined on $\mathbb{R}^2 \setminus (\gamma \cup \{(0,0)\})$, where $\gamma$
is the circle $x^2 + y^2 = \frac{1}{e}$ and $e = 2.71\ldots$. Here $f$ is the
composition of the following continuous functions:
$$x \mapsto \cos x, \quad (x,y) \mapsto \frac{x}{y},\ y \ne 0, \quad (x,y) \mapsto x + y, \quad (x,y) \mapsto xy,$$
$$x \mapsto \sin x \quad \text{and} \quad x \mapsto \ln x,\ x > 0$$
(prove everything slowly!). The same theorem is used to prove that the
set of all continuous functions defined on the same set $A$ (open, closed,
etc.) is a real infinite dimensional (it contains polynomials!) vector space
(prove it!).


3. Continuous functions on compact sets


Let $A$ be an arbitrary nonempty subset of $\mathbb{R}^n$ and let
$\mathbf{f} : A \to \mathbb{R}^m$ be a continuous function (on the whole $A$).
Let $D$ be an open subset of $\mathbb{R}^n$ which is contained in $A$. Here is
a question: "Is the image $\mathbf{f}(D)$ of $D$ through $\mathbf{f}$ always
open in $\mathbb{R}^m$?" We shall see by simple examples that the answer is
no! Let us take, for instance, $D = (0,1)$ and $f(x) = 3$ for any $x$ in
$(0,1)$. Since the set $\{3\}$ is closed in $\mathbb{R}$ (why?), $f(D)$ is not
open. Let now $E$ be an open subset of $\mathbb{R}^m$ and
$\mathbf{f}^{-1}(E) = \{\mathbf{x} \in A : \mathbf{f}(\mathbf{x}) \in E\}$ the
preimage of $E$ in $A$. We say that a subset $B$ of $A$ is open in $A$ if it is
the intersection of $A$ with an open subset $D$ of $\mathbb{R}^n$, i.e.
$B = A \cap D$. For instance, $B = (0,1]$ is not open in $\mathbb{R}$ (why?),
but it is open in $A = [-1,1]$ because $D = (0,3)$, which is open in
$\mathbb{R}$, intersected with $A$ is exactly $B$.


THEOREM 56. With the definitions and notation given above,
$\mathbf{f} : A \to \mathbb{R}^m$ is continuous if and only if
$\mathbf{f}^{-1}(E)$ is open in $A$ for any open subset $E$ of $\mathbb{R}^m$,
i.e. if $\mathbf{f}$ carries back the open subsets of $\mathbb{R}^m$ into open
subsets of $A$.


PROOF. a) We assume that $\mathbf{f} : A \to \mathbb{R}^m$ is continuous and
that $E$ is an open subset of $\mathbb{R}^m$. To prove that
$\mathbf{f}^{-1}(E)$ is open in $A$ is equivalent to proving that
$C = A \setminus \mathbf{f}^{-1}(E)$ is closed in $A$, i.e. that for any
sequence $\{\mathbf{x}^{(n)}\}$ of elements of $C$ which is convergent to an
element $\mathbf{x}$ of $A$ (pay attention!), one has that $\mathbf{x}$ is
also in $C$. If it were not in $C$, then $\mathbf{f}(\mathbf{x}) \in E$. Since
$E$ is open in $\mathbb{R}^m$, there is a small ball
$B(\mathbf{f}(\mathbf{x}), r)$, with centre at $\mathbf{f}(\mathbf{x})$ and of
radius $r > 0$, which is contained in $E$. Since
$\mathbf{x}^{(n)} \to \mathbf{x}$ and since $\mathbf{f}$ is continuous, one
has that $\mathbf{f}(\mathbf{x}^{(n)})$ is convergent to
$\mathbf{f}(\mathbf{x})$. So, there is at least one $\mathbf{x}^{(n)}$ with
$\mathbf{f}(\mathbf{x}^{(n)})$ in $B(\mathbf{f}(\mathbf{x}), r)$, i.e. in $E$.
So, $\mathbf{x}^{(n)}$ is in $\mathbf{f}^{-1}(E)$, a contradiction, because we
have chosen the sequence $\{\mathbf{x}^{(n)}\}$ to have all its terms in $C$,
i.e. not in $\mathbf{f}^{-1}(E)$.

b) We suppose now that $\mathbf{f}$ carries back the open subsets of
$\mathbb{R}^m$ into open subsets of $A$. Let us prove that $\mathbf{f}$ is
continuous at an arbitrary fixed point $\mathbf{z}$. For this, let
$\{\mathbf{z}^{(n)}\}$ be a sequence in $A$ which is convergent to
$\mathbf{z} \in A$. We assume that $\{\mathbf{f}(\mathbf{z}^{(n)})\}$ is not
convergent to $\mathbf{f}(\mathbf{z})$. Then, there is a small ball
$B(\mathbf{f}(\mathbf{z}), r)$ in $\mathbb{R}^m$ such that an infinite number
$\{\mathbf{f}(\mathbf{z}^{(k_n)})\}$, $n = 1, 2, \ldots$, of the terms of the
sequence $\{\mathbf{f}(\mathbf{z}^{(n)})\}$ are outside
$B(\mathbf{f}(\mathbf{z}), r)$. Since $B(\mathbf{f}(\mathbf{z}), r)$ is an
open subset of $\mathbb{R}^m$, following the last hypothesis, we get that the
set $D = \mathbf{f}^{-1}(B(\mathbf{f}(\mathbf{z}), r))$ is an open subset of
$A$ which contains $\mathbf{z}$ (why?). Let $B(\mathbf{z}, r')$, $r' > 0$, be
a small ball with centre at $\mathbf{z}$ such that
$G = B(\mathbf{z}, r') \cap A \subset D$ (this is possible since $D$ is open
in $A$). All the terms of the subsequence $\{\mathbf{z}^{(k_n)}\}$ are not in
$G$; in particular, they are not in $B(\mathbf{z}, r')$. But this last
conclusion contradicts the fact that $\mathbf{z}^{(n)} \to \mathbf{z}$. Thus,
our assumption that $\{\mathbf{f}(\mathbf{z}^{(n)})\}$ is not convergent to
$\mathbf{f}(\mathbf{z})$ is false and so $\mathbf{f}$ is continuous at
$\mathbf{z}$. Since this $\mathbf{z}$ was arbitrarily chosen, we get that
$\mathbf{f}$ is continuous at all the points of $A$.


The following result is very useful in many situations in this course.
It appears as a direct consequence of the above theorem.


THEOREM 57. Let $A$ be an open subset of $\mathbb{R}^n$, let $\mathbf{a}$ be
a fixed point of $A$ and let $f : A \to \mathbb{R}$ be a continuous function
on $A$ such that $f(\mathbf{a}) > 0$. Then there is an open ball
$B(\mathbf{a}, r) \subset A$, $r > 0$, with the property that
$f(\mathbf{x}) > 0$ for every $\mathbf{x}$ in $B(\mathbf{a}, r)$.


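PROOF. Take $\varepsilon > 0$ such that $f(\mathbf{a}) - \varepsilon > 0$ and
take the open subset
$Y = (f(\mathbf{a}) - \varepsilon, f(\mathbf{a}) + \varepsilon)$ of
$\mathbb{R}$. Since $f$ is continuous, $X = f^{-1}(Y)$ is an open subset of
$A$ which contains $\mathbf{a}$. So, there is a small ball $B(\mathbf{a}, r)$
such that $B(\mathbf{a}, r) \subset X$, i.e. $f(\mathbf{x}) \in Y$ for any
$\mathbf{x}$ in $B(\mathbf{a}, r)$. But for such $\mathbf{x}$ we have that
$f(\mathbf{x}) > f(\mathbf{a}) - \varepsilon > 0$ and the proof is done.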


REMARK 21. In the same way one can prove that
$\mathbf{f} : A \to \mathbb{R}^m$ is continuous if and only if $\mathbf{f}$
carries back the closed subsets of $\mathbb{R}^m$ into closed subsets of $A$
(define this notion by analogy!). To prove this, one can use Theorem 56.


A continuous function $\mathbf{f} : \mathbb{R}^n \to \mathbb{R}^m$ does not
always carry a closed set of $\mathbb{R}^n$ into a closed set of
$\mathbb{R}^m$. For instance, $f : \mathbb{R} \to \mathbb{R}$,
$f(x) = \frac{1}{1 + x^2}$, carries the closed set $[0, \infty)$ into
$(0,1]$, which is no longer closed. It is interesting to see that the closed
set $[0, \infty)$ is unbounded. If one tries to substitute it with a closed
and bounded interval, for the same function, one will not succeed at all in
finding a nonclosed set as an image! Why? Because of the following basic
result:


THEOREM 58. Let $C$ be a compact (closed and bounded) subset of
$\mathbb{R}^n$ and let $\mathbf{f} : C \to \mathbb{R}^m$ be a continuous
function. Then the image $\mathbf{f}(C)$ of $C$, in $\mathbb{R}^m$, is also a
compact subset there (in $\mathbb{R}^m$). Moreover, if $m = 1$,
$\sup f(C) = f(\mathbf{z}_M)$ and $\inf f(C) = f(\mathbf{z}_m)$, where
$\mathbf{z}_M$, $\mathbf{z}_m$ are in $C$.


PROOF. We need to prove that: a) $\mathbf{f}(C)$ is bounded and b)
$\mathbf{f}(C)$ is closed. The ideas used for proving this theorem are exactly
the same as those used in the particular case ($m = 1$, $n = 1$) of
Theorem 32. We take them again here.

a) We assume that $\mathbf{f}(C)$ is not bounded. This means that for every
$n = 1, 2, \ldots$ one can find a point $\mathbf{x}^{(n)}$ in $C$ such that
$\|\mathbf{f}(\mathbf{x}^{(n)})\| > n$ (why?). Since $C$ is a compact subset
of $\mathbb{R}^n$, we can find a subsequence $\{\mathbf{x}^{(k_n)}\}$ which is
convergent to a point $\mathbf{x}$ of $C$ (see Theorem 53). Since
$\mathbf{f} : C \to \mathbb{R}^m$ is continuous, the sequence
$\{\mathbf{f}(\mathbf{x}^{(k_n)})\}$ is convergent to
$\mathbf{f}(\mathbf{x})$. But $\|\mathbf{f}(\mathbf{x}^{(k_n)})\| > k_n$ and
$k_n \to \infty$, so the numerical sequence
$\{\|\mathbf{f}(\mathbf{x}^{(k_n)})\|\}$ is unbounded (it goes to $\infty$!).
We shall see that this is a contradiction. Indeed,
$$\|\mathbf{f}(\mathbf{x}^{(k_n)})\| \le \|\mathbf{f}(\mathbf{x}^{(k_n)}) - \mathbf{f}(\mathbf{x})\| + \|\mathbf{f}(\mathbf{x})\|.$$
If we take limits in this last inequality, we get
$\infty \le 0 + \|\mathbf{f}(\mathbf{x})\|$, which is not possible! The
contradiction appeared because we supposed that $\mathbf{f}(C)$ is unbounded.
Hence, it is bounded, i.e. we just proved a).

b) We now use the closedness test (Theorem 51) for proving that
$\mathbf{f}(C)$ is closed. For this, let us take a convergent sequence
$\{\mathbf{f}(\mathbf{y}^{(n)})\}$, with terms in $\mathbf{f}(C)$ and with its
limit $\mathbf{c}$ in $\mathbb{R}^m$. We have to prove that this $\mathbf{c}$
is also in $\mathbf{f}(C)$. Since $C$ is a compact subset of $\mathbb{R}^n$,
there is a subsequence $\{\mathbf{y}^{(k_n)}\}$ of the sequence
$\{\mathbf{y}^{(n)}\}$ such that $\mathbf{y}^{(k_n)}$ is convergent to some
$\mathbf{y} \in C$. Since $\mathbf{f}$ is continuous, the sequence
$\{\mathbf{f}(\mathbf{y}^{(k_n)})\}$ is convergent to $\mathbf{f}(\mathbf{y})$.
But any subsequence of a convergent sequence is also convergent to the same
limit as the whole sequence. Thus, $\mathbf{c} = \mathbf{f}(\mathbf{y})$ and
so $\mathbf{c} \in \mathbf{f}(C)$, which is what we wanted to prove. The other
statements can be proved exactly in the same manner (see also Theorem 32).


Let us give a nice application of this last result. We can assume
that the surface of the Earth is closed and bounded in the 3-D space
$\mathbb{R}^3$ (why? For simplicity you can take it to be
$S = \{(x,y,z) : x^2 + y^2 + z^2 = R^2\}$, a sphere of radius $R$; prove that
$S$ is closed and bounded!). At a fixed moment, to any point $M(x,y,z)$ of the
Earth we associate its temperature $T(x,y,z)$ at that moment. Thus, we obtain
a continuous function $T$ defined on the compact surface of the Earth, with
values in $\mathbb{R}$. Applying the above theorem, we can always find two
points on the Earth at which the temperatures are extreme.

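Let $C$ be a compact (closed and bounded) subset of $\mathbb{R}^n$ and let
$\mathbf{f} : C \to \mathbb{R}^m$ be a continuous function. Then the set
$\|\mathbf{f}(C)\| = \{\|\mathbf{f}(\mathbf{x})\| : \mathbf{x} \in C\}$ of
the norms of the elements of the image $\mathbf{f}(C)$ is also a compact
subset of $\mathbb{R}$. Moreover,
$\sup \|\mathbf{f}(C)\| = \|\mathbf{f}(\mathbf{z})\|$ and
$\inf \|\mathbf{f}(C)\| = \|\mathbf{f}(\mathbf{y})\|$, where $\mathbf{z}$ and
$\mathbf{y}$ are in $C$. Firstly, the function
$$g : \mathbb{R}^m \to \mathbb{R}, \quad g(\mathbf{x}) = \|\mathbf{x}\|,$$
is a continuous function. Indeed, let $\{\mathbf{x}^{(n)}\}$ be a sequence in
$\mathbb{R}^m$ which is convergent to $\mathbf{x}$. Since
$\bigl| \|\mathbf{x}^{(n)}\| - \|\mathbf{x}\| \bigr| \le \|\mathbf{x}^{(n)} - \mathbf{x}\|$,
we see that the sequence $\{g(\mathbf{x}^{(n)})\} = \{\|\mathbf{x}^{(n)}\|\}$
is convergent to $\|\mathbf{x}\|$, i.e. $g$ is continuous. Secondly, let us
consider the composition $g \circ \mathbf{f} : C \to \mathbb{R}$ of the
continuous functions $\mathbf{f}$ and $g$. It is a continuous function (see
Theorem 55) and we can apply the last theorem (do it slowly!).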


REMARK 22. The condition on the closedness of $C$ in the above
theorem (Theorem 58) is necessary, as one can see in the example
$f : (0,1] \to \mathbb{R}$, $f(x) = \frac{1}{x}$; this function is continuous
(prove it!), the interval $(0,1]$ is bounded and nonclosed, and the image
$f((0,1]) = [1, \infty)$ is not bounded, so not a compact subset of
$\mathbb{R}$. If $C$ is closed but
not bounded, its image through a continuous function f may be non- 
closed and nonbounded at the same time. For instance, C' = [1,0o), 
f(z) = +, 80, f(C) = (0,00), which is neither closed (it is open 
in $\mathbb{R}$), nor bounded. The theorem above, stated for closed and
bounded sets, is not true in general metric spaces, because a compact subset
$C$ in a general metric space $(X,d)$ is defined "by sequences". Namely, $C$
is a compact subset of $(X,d)$ if any sequence in $C$ has a convergent
subsequence with its limit also in $C$. This is not generally equivalent to
"bounded and closed". The examples are too "exotic" and we do not give them
here. In a metric space $(X,d)$ we can introduce the "distance" between two
compact subsets $A$ and $B$ of $X$. Namely,
$$\operatorname{dist}(A, B) = \inf\{d(a,b) : a \in A, b \in B\}.$$
Since $d$ is a continuous function, this number $\operatorname{dist}(A, B)$
is always finite and it is realized, i.e. there are $a_0$ in $A$ and $b_0$ in
$B$ such that $\operatorname{dist}(A, B) = d(a_0, b_0)$. For instance, the
distance between the full square $A = [0,1] \times [1,2]$ and the disc
$B = \{(x,y) : (x - 2)^2 + y^2 \le 1\}$ is $\sqrt{2} - 1$ and it is realized
at $a_0 = (1,1) \in A$ and at
$b_0 = \left(2 - \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right) \in B$ (why?).
It is easy to prove that the distance between two disjoint compact subsets
$A$ and $B$ is realized on their boundaries (which are also compact subsets),
i.e.
$$\operatorname{dist}(A, B) = \operatorname{dist}(\partial A, \partial B).$$
Can you organize the set of all compact subsets of $X$ as a metric space
(with the distance function defined above)?
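The value $\sqrt{2} - 1$ for the square $[0,1] \times [1,2]$ and the closed unit disc centred at $(2,0)$ can be checked by a brute-force search. The sketch below samples the square on a uniform grid (the grid size and the helper names are assumptions of this illustration) and uses the fact that the distance from a point to a closed disc is the distance to the centre minus the radius, or zero inside the disc.

```python
import math

CENTER = (2.0, 0.0)  # centre of the disc B
RADIUS = 1.0         # radius of the disc B


def dist_point_to_disc(p):
    """Distance from p to the closed disc: zero inside, else |p - centre| - radius."""
    return max(0.0, math.hypot(p[0] - CENTER[0], p[1] - CENTER[1]) - RADIUS)


def nearest_on_disc(p):
    """Point of the disc closest to an outside point p (radial projection)."""
    d = math.hypot(p[0] - CENTER[0], p[1] - CENTER[1])
    return (CENTER[0] + RADIUS * (p[0] - CENTER[0]) / d,
            CENTER[1] + RADIUS * (p[1] - CENTER[1]) / d)


def dist_square_to_disc(steps=50):
    """Grid search over the square [0,1] x [1,2] for the point closest to the disc."""
    best, best_point = float("inf"), None
    for i in range(steps + 1):
        for j in range(steps + 1):
            p = (i / steps, 1.0 + j / steps)
            d = dist_point_to_disc(p)
            if d < best:
                best, best_point = d, p
    return best, best_point


distance, a0 = dist_square_to_disc()
b0 = nearest_on_disc(a0)
print(distance, a0, b0)
```

The grid contains the corner $(1,1)$, so the search recovers the exact minimizing point of the square; for other sets a grid search would only give an approximation of the distance.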


In practice, the above Theorem 58 can be applied to optimization
problems. For instance, let us find the maximal and the minimal values
of the function $f : [0,1] \times [0,2] \to \mathbb{R}$,
$f(x,y) = x^2 + y^2$. Since $C = [0,1] \times [0,2]$ is a compact subset of
$\mathbb{R}^2$ (prove it!), Theorem 58 implies that its image is a compact
subset of $\mathbb{R}$. So, $\sup f(C) = f(\mathbf{a})$ and
$\inf f(C) = f(\mathbf{b})$. It is easy to see that $\mathbf{a} = (1,2)$ and
$\mathbf{b} = (0,0)$ (the function is increasing relative to $x$ and $y$,
separately).
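A grid search gives a quick sanity check of these extreme points. The sketch below assumes the function is $f(x,y) = x^2 + y^2$ on the rectangle $[0,1] \times [0,2]$; the helper name `extremes_on_grid` and the grid resolution are hypothetical choices, and a grid can only approximate extrema in general (here the corners belong to the grid, so they are found exactly).

```python
def f(x, y):
    """Assumed objective: f(x, y) = x^2 + y^2."""
    return x ** 2 + y ** 2


def extremes_on_grid(func, steps=40):
    """Return (argmin, argmax) of func over a uniform grid on [0,1] x [0,2]."""
    points = [(i / steps, 2 * j / steps)
              for i in range(steps + 1)
              for j in range(steps + 1)]
    lowest = min(points, key=lambda p: func(*p))
    highest = max(points, key=lambda p: func(*p))
    return lowest, highest


b, a = extremes_on_grid(f)
print("minimum at", b, "value", f(*b))
print("maximum at", a, "value", f(*a))
```

Theorem 58 is what guarantees that the exact supremum and infimum are attained at points of the rectangle; the grid search merely locates them for this particular monotone function.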

A useful notion in integral computation (and not only there; see the
application below) is the notion of "uniform continuity".

DEFINITION 22. Let $A$ be a nonempty subset of $\mathbb{R}^n$ and let
$\mathbf{f} : A \to \mathbb{R}^m$ be a function defined on $A$ with values in
$\mathbb{R}^m$. We say that $\mathbf{f}$ is uniformly continuous on $A$ if for
any small quantity $\varepsilon > 0$ there is another small quantity
$\delta_\varepsilon > 0$ (depending on $\varepsilon$) such that whenever we
have two points $\mathbf{x}'$ and $\mathbf{x}''$ in $A$ with the distance
$\|\mathbf{x}' - \mathbf{x}''\|$ between them less than $\delta_\varepsilon$,
the distance $\|\mathbf{f}(\mathbf{x}') - \mathbf{f}(\mathbf{x}'')\|$ between
their images is less than $\varepsilon$.


The word "uniform" refers to the fact that here the continuity is
not defined at a point, but on the whole $A$. Moreover, the variation
$\|\mathbf{f}(\mathbf{x}') - \mathbf{f}(\mathbf{x}'')\|$ of
$\mathbf{f}(\mathbf{x})$ is uniform relative to the variation
$\|\mathbf{x}' - \mathbf{x}''\|$ of $\mathbf{x}$. Thus, if we want the
variation of $\mathbf{f}(\mathbf{x})$ to be less than $0.001$
($\|\mathbf{f}(\mathbf{x}') - \mathbf{f}(\mathbf{x}'')\| < 0.001$) in the case
of a uniformly continuous function $\mathbf{f}$, we can find a constant
$\delta = \delta_{0.001} > 0$ such that wherever $\mathbf{a}'$ and
$\mathbf{a}''$ are in $A$, with the distance between them less than this
constant $\delta$, we are sure that the corresponding variation of
$\mathbf{f}$, $\|\mathbf{f}(\mathbf{a}') - \mathbf{f}(\mathbf{a}'')\|$, is
less than $0.001$.


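REMARK 23. The notion of uniform continuity is stronger than
"simple" continuity. Indeed, let $\mathbf{f} : A \to \mathbb{R}^m$ be a
uniformly continuous function on $A$ and let $\mathbf{a}$ be a fixed point in
$A$. We shall prove that $\mathbf{f}$ is continuous at $\mathbf{a}$. For this,
let $\{\mathbf{a}^{(n)}\}$ be a sequence in $A$ which is convergent to
$\mathbf{a}$. We want to prove that the sequence
$\{\mathbf{f}(\mathbf{a}^{(n)})\}$ is convergent to $\mathbf{f}(\mathbf{a})$
by using only the definition of convergence. In fact, we want to prove that
the numerical sequence
$\{d(\mathbf{f}(\mathbf{a}^{(n)}), \mathbf{f}(\mathbf{a}))\}$ tends to zero.
Now we use the usual Definition 1. For this, let $\varepsilon > 0$ be a small
positive real number. Since $\mathbf{f}$ is uniformly continuous, there is a
$\delta_\varepsilon > 0$ such that whenever
$\|\mathbf{x}' - \mathbf{x}''\| < \delta_\varepsilon$, one has that
$$\|\mathbf{f}(\mathbf{x}') - \mathbf{f}(\mathbf{x}'')\| < \varepsilon.$$
Let us take now $\mathbf{x}''$ to be $\mathbf{a}$ and
$\mathbf{x}' = \mathbf{a}^{(n)}$ with $n \ge N$, where $N$ is chosen such that
$\|\mathbf{a}^{(n)} - \mathbf{a}\| < \delta_\varepsilon$ for all $n \ge N$.
Thus,
$$\|\mathbf{f}(\mathbf{a}^{(n)}) - \mathbf{f}(\mathbf{a})\| < \varepsilon$$
whenever $n \ge N$ and so we have just proved that the sequence
$\{\mathbf{f}(\mathbf{a}^{(n)})\}$ is convergent to $\mathbf{f}(\mathbf{a})$,
i.e. $\mathbf{f}$ is continuous at the arbitrarily chosen point $\mathbf{a}$.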


But continuity does not always imply uniform continuity. For instance,
$f(x) = \ln x$, $x \in (0,1]$, is a continuous function but not a uniformly
continuous one. Indeed, consider the sequences $x_n' = \frac{1}{n}$ and
$x_n'' = \frac{1}{2n}$. It is clear that
$|x_n' - x_n''| = \frac{1}{2n} \to 0$, but
$|\ln x_n' - \ln x_n''| = \ln 2 \not\to 0$. Thus, if we take
$\varepsilon < \ln 2$ in Definition 22, we can NEVER find a small
$\delta_\varepsilon > 0$ such that for all pairs $(x', x'')$ with
$|x' - x''| < \delta_\varepsilon$ one has
$$|\ln x' - \ln x''| < \varepsilon < \ln 2.$$
To see this, let us take $n_0$ large enough such that
$$|x_{n_0}' - x_{n_0}''| = \frac{1}{2 n_0} < \delta_\varepsilon.$$
For the pair $(x_{n_0}', x_{n_0}'')$,
$$|\ln x_{n_0}' - \ln x_{n_0}''| = \ln 2,$$
which is greater than $\varepsilon$, so the definition of uniform continuity
does not work for this function.

The next result says that for functions defined on compact sets,
continuity and uniform continuity coincide. Pay attention: in our case
above, $(0,1]$ is not compact! This is why we could prove that
$f(x) = \ln x$ is not uniformly continuous.
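The failure of uniform continuity of $\ln x$ on $(0,1]$ is easy to observe numerically. The sketch below assumes the two sequences $x_n' = 1/n$ and $x_n'' = 1/(2n)$; the helper name `gaps` is hypothetical. It prints the distance between the two inputs next to the distance between their logarithms.

```python
import math


def gaps(n):
    """Return (|x'_n - x''_n|, |ln x'_n - ln x''_n|) for x'_n = 1/n, x''_n = 1/(2n)."""
    x1, x2 = 1.0 / n, 1.0 / (2 * n)
    return abs(x1 - x2), abs(math.log(x1) - math.log(x2))


for n in (1, 10, 1000, 10 ** 6):
    input_gap, output_gap = gaps(n)
    print(n, input_gap, output_gap)

print("ln 2 =", math.log(2))
```

The first gap shrinks to zero while the second one stays at $\ln 2$, so no single $\delta$ can serve every pair of points once $\varepsilon < \ln 2$.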


THEOREM 59. Let $C$ be a compact subset of $\mathbb{R}^n$ and let
$\mathbf{f} : C \to \mathbb{R}^m$ be a continuous function defined on $C$.
Then $\mathbf{f}$ is uniformly continuous on $C$.


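PROOF. We suppose the contrary, namely that $\mathbf{f}$ is not uniformly
continuous on $C$. We must carefully negate the statement of Definition
22. Thus, there is an $\varepsilon_0 > 0$ such that for any $\delta > 0$
there is at least one pair $(\mathbf{x}_\delta', \mathbf{x}_\delta'')$ with
elements in $C$ such that
$\|\mathbf{x}_\delta' - \mathbf{x}_\delta''\| < \delta$ and
$$\|\mathbf{f}(\mathbf{x}_\delta') - \mathbf{f}(\mathbf{x}_\delta'')\| \ge \varepsilon_0.$$
In particular, let us take for these $\delta$ the values
$\delta_k = \frac{1}{k}$ for $k = 1, 2, \ldots$. As above, for such
$\delta_k$, $k = 1, 2, \ldots$, one can find two sequences
$\{\mathbf{x}'^{(k)}\}$ and $\{\mathbf{x}''^{(k)}\}$ with
$\|\mathbf{x}'^{(k)} - \mathbf{x}''^{(k)}\| < \frac{1}{k}$ and
$$\|\mathbf{f}(\mathbf{x}'^{(k)}) - \mathbf{f}(\mathbf{x}''^{(k)})\| \ge \varepsilon_0 > 0.$$
Since $C$ is a compact set, we can find two subsequences,
$\{\mathbf{x}'^{(k_n)}\}$ of $\{\mathbf{x}'^{(k)}\}$ and
$\{\mathbf{x}''^{(k_n)}\}$ of $\{\mathbf{x}''^{(k)}\}$ (why can we take the
same $k_n$ for both subsequences?), such that both subsequences are
convergent to the same limit $\mathbf{y} \in C$, because
$$\|\mathbf{x}'^{(k_n)} - \mathbf{x}''^{(k_n)}\| < \frac{1}{k_n} \to 0.$$
Since $\mathbf{f}$ is continuous, both sequences
$\{\mathbf{f}(\mathbf{x}'^{(k_n)})\}$ and
$\{\mathbf{f}(\mathbf{x}''^{(k_n)})\}$ are convergent to the same limit
$\mathbf{f}(\mathbf{y})$. So the distance
$\|\mathbf{f}(\mathbf{x}'^{(k_n)}) - \mathbf{f}(\mathbf{x}''^{(k_n)})\|$
between the corresponding terms becomes smaller and smaller as
$n \to \infty$, a contradiction, because
$\|\mathbf{f}(\mathbf{x}'^{(k_n)}) - \mathbf{f}(\mathbf{x}''^{(k_n)})\|$ is
always greater than or equal to $\varepsilon_0$. Thus, our assumption on the
nonuniform continuity of $\mathbf{f}$ is false. Hence, $\mathbf{f}$ is
uniformly continuous.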

This result is very useful in practice. For instance, the function $f(x) = \ln x$ is uniformly continuous on any closed interval $[a,b] \subset (0,\infty)$. Indeed, $[a,b]$ is a compact subset of the domain of definition $(0,\infty)$ of $f$, $f$ is continuous on $[a,b]$ and so we can apply Theorem 59 above.


EXAMPLE 13. Let $C$ be a 3D-object ($C \subset \mathbb{R}^3$), bounded and containing its boundary $\partial C$, as is usual in practice. We know that $C$ is closed if and only if it contains its boundary $\partial C$. Let us assume that at any point $M(x,y,z)$ of $C$ we have a density $f(x,y,z)$. It is common to suppose that the density function $f : C \to \mathbb{R}$ is a continuous function. The above theorem and our hypotheses on $C$ say that $f$ is uniformly continuous. We cannot practically work with this function because nobody gives it to us in advance. But we can perform some measurements. How do we perform such measurements $f(x_i, y_i, z_i)$, $i = 1, 2, \ldots, n$, such that if we choose a point $M(x,y,z)$ in $C$, we can find $i_0$ with

$$|f(x,y,z) - f(x_{i_0}, y_{i_0}, z_{i_0})| < \varepsilon$$

(here $\varepsilon$ is a small positive real number which controls the error, for instance $\varepsilon = 1/1000$)? Since our function is uniformly continuous, there is a small $\delta > 0$ such that whenever the distance between two points $\mathbf{x}' = (x',y',z')$ and $\mathbf{x}'' = (x'',y'',z'')$ of $C$ is less than this $\delta$, we have that

$$|f(x',y',z') - f(x'',y'',z'')| < \varepsilon.$$

It remains to divide the body $C$ into subbodies $C_i$, $i = 1, 2, \ldots, n$, such that $C = \bigcup_{i=1}^{n} C_i$ and the diameters

$$\omega_i = \sup\{\|\mathbf{x}' - \mathbf{x}''\| : \mathbf{x}', \mathbf{x}'' \in C_i\}$$

of the $C_i$ are less than $\delta$. Let us now choose a fixed point $M_i(x_i, y_i, z_i)$ in each $C_i$ for $i = 1, 2, \ldots, n$. Then the approximation

$$f(x,y,z) \approx f(x_i, y_i, z_i)$$

is a good one if $M(x,y,z) \in C_i$. This means that

$$|f(x,y,z) - f(x_i, y_i, z_i)| < \varepsilon.$$

Thus, we can perform measurements of the density function values only at some arbitrarily chosen points $M_i$ in each $C_i$.
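The measuring procedure of this example can be sketched in code. The sketch below makes several assumptions that are not in the text: the body is the unit cube $[0,1]^3$, the density `density` is a made-up function, the subbodies $C_i$ are the cells of a regular grid, and $\delta$ is obtained from a bound on the gradient of this particular density (for a general continuous density such a bound need not be available).

```python
import math
import random

def density(x, y, z):
    # Hypothetical continuous density on the unit cube, chosen only for illustration.
    return 1.0 + x + y * y + z

def cells_per_axis(delta):
    # Cubes of side 1/m have diameter sqrt(3)/m; pick m so that this is < delta.
    return math.floor(math.sqrt(3.0) / delta) + 1

def sample_point(p, m):
    # Centre M_i of the cell C_i of the m x m x m grid that contains p.
    centre = []
    for t in p:
        k = min(int(t * m), m - 1)   # index of the cell along this axis
        centre.append((k + 0.5) / m)
    return tuple(centre)

eps = 0.05
lipschitz = math.sqrt(6.0)           # bound for |grad density| on [0,1]^3
delta = eps / lipschitz              # distances < delta give errors < eps
m = cells_per_axis(delta)

random.seed(0)
worst = 0.0
for _ in range(10_000):
    p = (random.random(), random.random(), random.random())
    c = sample_point(p, m)
    worst = max(worst, abs(density(*p) - density(*c)))
print(m, worst < eps)
```

Only the values of the density at the cell centres are needed: every point of the cube is closer than $\delta$ to the centre of its own cell, so the measured value there differs from the true density by less than $\varepsilon$.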
We give here a very useful result, in a more general setting (define and prove things slowly!).


THEOREM 60. Let $X$ and $Y$ be two compact metric spaces (recall that a metric space is compact if any sequence in it has at least one convergent subsequence) and let $f : X \to Y$ be a continuous bijection from $X$ onto $Y$. Let $g : Y \to X$ be its inverse. Then $g$ is also continuous.


PROOF. Let us prove that $g$ carries back closed subsets of $X$ into closed subsets of $Y$ (see Remark 21). Let $C$ be a closed subset of $X$ and let $E = g^{-1}(C) = f(C)$. Since $X$ is compact, $C$ is also compact (prove it!). Since $f$ is continuous, $E = f(C)$ is compact, so $E$ itself is closed in $Y$ (prove it!). Hence, $g$ is continuous.


COROLLARY 7. Let $f$ be a strictly monotone continuous function which carries the interval $[a,b]$ onto the interval $[c,d]$ (see also the next section, Darboux's theorem). Then $f$ is invertible and its inverse $g$ is also continuous.


PROOF. Since $f$ is strictly monotone, it is one-to-one (injective). Since both intervals are compact metric spaces, we simply apply the previous result. Here, "onto" means surjectivity!


4. Continuous functions on connected sets 


Let $A$ be a subset of $\mathbb{R}^n$. A continuous curve in $A$ is a continuous vector function $\gamma : I \to A$, defined on an interval $I$, finite or not, open or not, closed or not. In fact, we think of the image $\gamma(I)$ of the interval $I$ through $\gamma$. Let $M(x_1, x_2, \ldots, x_n)$ be a point in $A$. We say that $\gamma$ passes through $M$ if there is $t_0$ in $I$ such that $\gamma(t_0) = M$.


DEFINITION 23. We say that the subset $A$ of $\mathbb{R}^n$ is connected if any two points $M_1$ and $M_2$ of $A$ can be connected by a continuous curve, i.e. if there is a continuous function $\gamma : I \to A$ and $t_1, t_2 \in I$ such that $\gamma(t_1) = M_1$ and $\gamma(t_2) = M_2$. This means that $\gamma$ passes through $M_1$ and $M_2$.


REMARK 24. An interval $I$ of $\mathbb{R}$ is a subset of $\mathbb{R}$ with the following property: if $a, b \in I$ and $x$ is between $a$ and $b$ ($a < x < b$), then $x$ is also in $I$. In $\mathbb{R}$, the connected subsets are exactly the intervals of $\mathbb{R}$. Indeed, let $I$ be a connected subset of $\mathbb{R}$, let $a, b \in I$ and let $x$ be such that $a < x < b$. Since $I$ is connected, let $\gamma : J \to I$ be a continuous curve which connects $a$ and $b$. This means that there are $t_1$ and $t_2$ in $J$ such that $\gamma(t_1) = a$ and $\gamma(t_2) = b$. We can restrict $\gamma$ to the interval $[t_1, t_2] \subset J$ and apply the Darboux property for the continuous function $\gamma$ (see Theorem 33). Hence $x = \gamma(t_3)$, where $t_3 \in [t_1, t_2]$. So $x \in I$; thus $I$ is an interval. Conversely, let $I$ be an interval in $\mathbb{R}$ and let $x_1, x_2 \in I$. Let $\gamma : [x_1, x_2] \to I$ be the identity mapping. This is obviously a continuous curve which connects $x_1$ and $x_2$.

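THEOREM 61. Let $A$ be a connected subset of $\mathbb{R}^n$ and let $\mathbf{f} : A \to \mathbb{R}^m$ be a continuous mapping defined on $A$ with values in $\mathbb{R}^m$. Then the image $\mathbf{f}(A)$ of $\mathbf{f}$ in $\mathbb{R}^m$ is also a connected subset of $\mathbb{R}^m$.


PROOF. Let $\mathbf{f}(\mathbf{x})$ and $\mathbf{f}(\mathbf{y})$ be two points in $\mathbf{f}(A)$, $\mathbf{x}, \mathbf{y} \in A$. Since $A$ is connected, there is a continuous curve $\gamma : I \to A$ and two points $a, b \in I$ (an interval in $\mathbb{R}$) such that $\gamma(a) = \mathbf{x}$ and $\gamma(b) = \mathbf{y}$. Now, the composition $\mathbf{f} \circ \gamma : I \to \mathbb{R}^m$ is a continuous curve with $(\mathbf{f} \circ \gamma)(a) = \mathbf{f}(\mathbf{x})$ and $(\mathbf{f} \circ \gamma)(b) = \mathbf{f}(\mathbf{y})$. Thus $\mathbf{f}(A)$ is a connected subset of $\mathbb{R}^m$.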


This is a fundamental result for various practical exercises. For instance, let

$$S = \{(x,y,z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 \le R^2\}$$

be the 3D-ball of radius $R$ with centre at the origin. Let $f : S \to \mathbb{R}$ be the function which associates to any point $M(x,y,z)$ the sum of its coordinates, namely

$$f(x,y,z) = x + y + z.$$

Let us find the image of $S$ through $f$. Since $S$ is connected (in fact $S$ is a convex subset of $\mathbb{R}^3$, i.e. for any pair of points $L$, $P$ of $S$, the segment $[L, P]$ is contained in $S$) and since $f$ is continuous, its image in $\mathbb{R}$ is a connected subset (see Theorem 61), i.e. it is an interval (see Remark 24). In fact, this image is a closed and bounded interval because $S$ is a compact set (why?) and $f$ is continuous. So it is of the form $[m, M]$, where $m = \inf f(S)$ and $M = \sup f(S)$. To find $m$ and $M$ is not an easy task. We only remark that the points where the greatest and the smallest values are attained must be on the boundary $\partial S$ of $S$, namely where $x^2 + y^2 + z^2 = R^2$ (otherwise, if an extremum point $H(a,b,c)$, say a maximum, were inside the ball and not on the boundary $\partial S$, then we could slightly increase (or decrease, in the case of a minimum) one of the values $a$, $b$ or $c$, such that the new point $L$ obtained in this way still belongs to the ball and at it the function $f$ has a greater value than the value of $f$ at $H$). In a later section (Conditional extremum points) we shall see how to compute $m$ and $M$.
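Before the exact method is available, the interval $[m, M]$ can be estimated by sampling the boundary sphere. The sketch below is only a numerical experiment with names of our own choosing; the comparison value $R\sqrt{3}$ comes from the Cauchy-Schwarz inequality $|x + y + z| \le \sqrt{3}\sqrt{x^2 + y^2 + z^2}$, a fact not used in the text at this point.

```python
import math
import random

def random_point_on_sphere(radius, rng):
    # Normalise a Gaussian vector to get a point on the sphere of given radius.
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(t * t for t in v))
        if norm > 1e-12:
            return [radius * t / norm for t in v]

def estimate_range(radius, samples=50_000, seed=0):
    # Smallest and largest value of x + y + z over random boundary points.
    rng = random.Random(seed)
    values = [sum(random_point_on_sphere(radius, rng)) for _ in range(samples)]
    return min(values), max(values)

R = 2.0
m_est, M_est = estimate_range(R)
print(m_est, M_est, R * math.sqrt(3.0))
```

The sampled extremes approach $-R\sqrt{3}$ and $R\sqrt{3}$ from inside, which is consistent with the image being a closed bounded interval whose endpoints are attained on $\partial S$.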

The above theorem is helpful in proving the following useful result (this result provides the basis for different algorithms for solving algebraic equations).


THEOREM 62. Let $f : [a,b] \to \mathbb{R}$ be a continuous function such that $f(a) \cdot f(b) < 0$. Then, there is a point $c$ in $(a,b)$ such that $f(c) = 0$. This means that the equation $f(x) = 0$ has at least one solution in the interval $[a,b]$.


PROOF. The set $f([a,b])$ is an interval (see Theorem 61 and Remark 24) which contains $f(a)$ and $f(b)$. Since $f(a) \cdot f(b) < 0$, the numbers $f(a)$ and $f(b)$ have distinct signs. Since $f([a,b])$ is an interval and since $0$ is between $f(a)$ and $f(b)$, $0$ must also be in $f([a,b])$. This means that there is a $c$ in $[a,b]$ such that $f(c) = 0$. Since $f(a) \cdot f(b) < 0$, this $c$ can be neither $a$ nor $b$, so $c \in (a,b)$.


REMARK 25. In fact, the statement of this last theorem is equivalent to the statement of Darboux Theorem 33. Let us prove for instance that the above last theorem implies Darboux Theorem 33. Let $m = \inf_{x \in [a,b]} f(x) = f(x_1)$ (see Weierstrass Theorem 32) and $M = \sup_{x \in [a,b]} f(x) = f(x_2)$. Let us choose a number $\lambda \in (m, M)$ and let us consider the auxiliary continuous function $g(x) = f(x) - \lambda$. Let us now take the interval $[x_1, x_2]^{\pm}$ (here $\pm$ means that $[x_1, x_2]^{\pm} = [x_1, x_2]$ if $x_1 < x_2$ and $[x_1, x_2]^{\pm} = [x_2, x_1]$ if $x_2 < x_1$; if $x_1 = x_2$ our function is constant and one has nothing to prove). Since $g(x_1) \cdot g(x_2) < 0$ (if one of the factors is equal to $0$ we also have nothing more to prove!), Theorem 62 says that there exists a number $c \in (a,b)$ such that $g(c) = 0$, i.e. $f(c) = \lambda$, and Darboux Theorem is proved. The converse is very easy (prove it!).


We can use Theorem 62 in order to find approximate solutions of an equation $f(x) = 0$ in an interval $[a,b]$ on which the function $f$ is continuous (find a counterexample to this theorem in the case when $f$ is not continuous). We also assume that $f(a) \cdot f(b) < 0$. Let us divide the segment $[a,b]$ into two equal parts and choose that one, $[a_1, b_1]$, for which $f(a_1) \cdot f(b_1) < 0$ (if $f(a_1) = 0$ or $f(b_1) = 0$, then $c = a_1$ or $c = b_1$ and we stop the process). Let us repeat the same with the subinterval $[a_1, b_1]$ instead of $[a,b]$, and so on. If we cannot find $a_n$ or $b_n$, $n = 1, 2, \ldots$, such that $f(a_n) = 0$ or $f(b_n) = 0$, the solution $c$ is (the unique point) in the intersection $\bigcap_{n=1}^{\infty} [a_n, b_n]$ (why?). So, for a small error indicator $\varepsilon > 0$, if we take $n_0$ such that $\frac{b-a}{2^{n_0}} < \varepsilon$, then the approximation $c \approx a_{n_0}$ (or $c \approx b_{n_0}$) leads to an error less than $\varepsilon$ (why?). This is in fact the description of a well-known algorithm in Computer Science for constructing approximate solutions for a large class of equations.
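The halving procedure just described is usually called the bisection method. A minimal sketch in Python follows; the function name `bisect`, its parameters and the test equation $x^3 - 2 = 0$ are our own illustrative choices, not taken from the text.

```python
def bisect(f, a, b, eps=1e-8, max_iter=200):
    """Approximate a zero of a continuous f on [a, b], assuming f(a) * f(b) < 0."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        if b - a < eps:
            break                   # the zero lies in [a, b], of length < eps
        mid = (a + b) / 2.0
        fm = f(mid)
        if fm == 0.0:
            return mid              # exact zero found, stop the process
        if fa * fm < 0:
            b, fb = mid, fm         # the sign change is in [a, mid]
        else:
            a, fa = mid, fm         # the sign change is in [mid, b]
    return a

root = bisect(lambda x: x**3 - 2.0, 1.0, 2.0)
print(root)
```

Each step halves the interval that still contains a sign change, so after $n$ steps its length is $(b-a)/2^n$, and either endpoint approximates the solution with an error smaller than that length.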
5. The Riemann sphere


In Fig. 6.3 we have a sphere $S$ of radius $R > 0$ and with centre at the origin $O(0,0,0)$. Its equation is

(5.1) $x^2 + y^2 + z^2 = R^2$.


Fig. 6.3


We know that the subset

$$S = \{(x,y,z) : x^2 + y^2 + z^2 = R^2\}$$

is a compact subset of $\mathbb{R}^3$ (it is closed and bounded, why?). Since B. Riemann used this model for explaining the "compactification" of the usual complex plane $\mathbb{C}$ (identified here with the coordinate plane $xOy$), we call $S$ the Riemann sphere. We call the point $N(0,0,R)$ the north pole of $S$ (see Fig. 6.3). Let us associate to any point $M(x,y,z)$ of the sphere $S$, different from $N$, the point $M'(a,b,0)$ in the plane $xOy$ ($= \mathbb{C}$), obtained by intersecting the line $NM$ with the plane $xOy$ (see Fig. 6.3). Since to $N$ we cannot associate in this way a point in $xOy$, we obtain a one-to-one correspondence between $S \setminus \{N\}$ and $\mathbb{C}$. Let us denote by $f : S \setminus \{N\} \to \mathbb{C}$ the mapping $M \mapsto M'$, or $f(M) = M'$. It is not so easy to express $a$ and $b$ as functions of $x$, $y$, $z$. If we think of a sequence $\{M_n\}$ of points on $S \setminus \{N\}$ which is convergent in $\mathbb{R}^3$ to $M \ne N$, it is easy to see that the sequence $\{M'_n\}$ is convergent to $M'$ in $\mathbb{C}$. So $f$ is a continuous function on $S \setminus \{N\}$. As in the case of the "compactification" of $\mathbb{R}$ by adding the symbols $\{\pm\infty\}$ (since in $\overline{\mathbb{R}} = \mathbb{R} \cup \{\pm\infty\}$ any sequence has at least one convergent subsequence (why?), it is a compact metric space!) we take a symbol "$\infty$" outside $\mathbb{C}$ and consider $\overline{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$ with some obvious algebraic operations: $z + \infty = \infty + z = \infty$, $z \in \mathbb{C}$, $|\infty| = \infty$ (this is the symbol $+\infty$ from $\overline{\mathbb{R}}$), etc. If we now extend the function $f$ to the whole sphere $S$ by putting $f(N) = \infty$, we obtain a bijection between the Riemann sphere and $\overline{\mathbb{C}}$. We say that a sequence $\{z_n\}$ of $\mathbb{C}$ is convergent to $\infty$ if $|z_n| \to \infty$ in $\overline{\mathbb{R}}$. So this $f$ is invertible and $f^{-1}$ is also continuous. In particular, $\overline{\mathbb{C}}$ is a compact metric space, the least compact metric space which contains $\mathbb{C}$ (why?). This is why one can also call $\overline{\mathbb{C}}$ the Riemann sphere. For instance, a "ball" with centre at $\infty$ is the exterior of a usual closed ball with centre at $O$ and of radius $r > 0$: $\{(x,y,z) : x^2 + y^2 + z^2 > r^2\}$. The notion of the Riemann sphere is very important when we work with functions of a complex variable. Intuitively, $\infty$ can be realized as the circumference of a "circle" with centre at $O \in \mathbb{C}$ and of infinite radius. So, the fundamental "$\varepsilon$-neighborhoods" of $\infty$ are of the form $\{z \in \mathbb{C} : |z| > R\}$, where $R$ is any positive (usually large) real number. We finally remark that the metric structure on $S$ is the one induced from $\mathbb{R}^3$.
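The coordinates $a$ and $b$ of $M'$ can be obtained by writing the line $NM$ as $N + t(M - N)$ and asking for the third coordinate to vanish, which gives $t = R/(R - z)$, hence $a = Rx/(R - z)$ and $b = Ry/(R - z)$. The sketch below implements this map and its inverse; the function names `project` and `unproject` are our own, and the derivation is ours rather than the book's.

```python
import math

def project(x, y, z, R=1.0):
    """Map M(x, y, z) on the sphere (M != N) to M'(a, b) in the plane z = 0."""
    if math.isclose(z, R):
        raise ValueError("the north pole N has no image in the plane")
    t = R / (R - z)          # parameter where the line N + t (M - N) meets z = 0
    return t * x, t * y

def unproject(a, b, R=1.0):
    """Inverse map: the second intersection of the line N M' with the sphere."""
    s = 2.0 * R * R / (a * a + b * b + R * R)
    return s * a, s * b, R * (1.0 - s)

a, b = project(0.6, 0.0, 0.8)
print(a, b)
print(unproject(a, b))
```

As $a^2 + b^2$ grows, the parameter `s` in `unproject` tends to $0$ and the point on the sphere tends to $N(0,0,R)$, which matches the convention $f(N) = \infty$.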


6. Problems 


1. Say if the following sets are open, closed, bounded, compact or connected. In each case, compute their closure and their boundaries. Draw them carefully!

a) $\{(x,y) : x^2 + y^2 < 9\}$;
b) $\{(x,y) : x^2 + y^2 > 9\}$;
c) $\{(x,y) : x^2 + y^2 = 5\}$;
d) $\{(x,y) : x \in [0,1),\ y \in (1,2]\}$;
e) $\{(x,y) : x + y = 3\}$;


f){(q,0) : ¢ € Q}; g){(0,2) : n = 1,2,... }; D{(z,y) : y? = 2e,2 € 
[0, 1)}; i) ae 
iG: ”) b= 12. ey 


j) $\{(x,y,z) : x + y + z < 3;\ x, y, z \in [0,\infty)\}$;
k) $\{(x,y,z) : x \in [-1,1],\ y \in (0,4],\ z \in (-3,5]\}$;
l) $\{z \in \mathbb{C} : |z - 2i| < 3\}$;
m) $\{z \in \mathbb{C} : |2z + 3| < 6\}$;
n) $\{z \in \mathbb{C} : |z + 3 - 2i| > 4\}$;
o) $\{z \in \mathbb{C} : z = x + iy,\ x = 2,\ y < 3\}$;
p) $\{z \in \mathbb{C} : 2 < |z - 2| < 4\}$;
q) $\{z \in \mathbb{C} : |z - 3 + 2i| > 2\}$;
r) $\{f \in C[0,2] : \|f\| < 2\}$;
s) $\{f \in C[0,2\pi] : \|f\| = 3\}$;
u) $\{f \in C[0,2\pi] : \|f - \sin x\| < 0.3\}$;


v) $\{f \in C[-3,3] : g - \frac{1}{2} < f < g + \frac{1}{2}\}$, where $g(x) = x$, $g(x) = -x$, or $g(x) = x^2$;
w) $\{f \in C[0,1] : 2 < \|f - g\| < 4\}$, where $g(x) = 2$;
y) $D = \{(x,y) : \ln(x^2 + y^2 - 4)/(x + 2y) \text{ is well defined}\}$.
2. Compute the limits of the following sequences: 


‘ x = ( 1 n-1 as Ae), 


Qn+1’3n+4’ n 
b) 
i 
x) = J/n-1 ie | 
Yn—-Wn-1 14+n 
c) 
3+ 2in 
n yb = =i 
n+ 2% 


d) zy = (1+ 4)"; e) z, = exp (in + +); 

3. Starting with the definition of continuity and of uniform continuity, determine which of the following functions are continuous and which are uniformly continuous.

a) $f(x) = \sin x$, $x \in [0,\pi]$;

b) 


Oe @+y—),2 E [L.2l.y € [3,4]; 
c) $f(x,y,z) = x - y$, where $x^2 + y^2 + z^2 = 4$; d) f(z) =2,2€ (0,2).


4. Some of the following limits exist, some do not exist. Say (and prove!) which of them exist and compute them in the affirmative situation.
i ery’ t1 : aay 
a) a oe 9) 28 aia ‘eat gay a 
i 1 
eat 0) ae (Hint: Pip ESS etc.); 
d) 
2 4,2 
lim SE 
(x,y)—(0,0) |x| + |y 
- ; wh lel, 
(Hint: Ea pean Saks etc.); ots a, -f) lim z 2a lim 


h i fu 
de Pe ae we 
i) 
2 
iii. 
(x,y) (0,0) x? + y4 
(Hint: use (4,0) and (4, 2)); 


n2? n 


5. Compute, if you can, the following directional limits:


J--b) lim 2aty 


* x 
a lim — 
) we +ty2? et tae ro+y2? 


r—0,y=mx 
c) 


lim # exp(—(x + y)); 


L000, y=Mz LF 


d) 
: Ss 
Pe be ta i oY exp(x +y ye 
6. Compute: 
lim ( I 1+ xyz, cos(z + y+ :)) 
(x,y,z) 0 \ v2 + y2 +1? ; 


and explain everything you did, step by step (small steps!). 


7. Study the continuity of the following functions: 


a) $f : \mathbb{R} \to \mathbb{R}$, $f(x) = 1$, if $x \in \mathbb{Q}$ and $f(x) = 0$, if $x \notin \mathbb{Q}$ (Dirichlet's function);
b) 


f:R-R, f(z) = 
ifz €Q, and f(r) = —-2, ifr ¢Q; 


c) 
hie: R= R, f(z) = exp(—2), 
ie O-and fe) = sing i aS 0; 


exp(—|z|)—1, 
r ) 




f :R’ > R’, f(x,y) = (2,0); 


:R? SR, f(e,y) = d((2,y), (0,0) = Va? + ¥5 


Ty 
rae R? R’, f(z,y) — (ow), 


if (x,y) 4 (0,0) and f(0,0) = (0,0); 


g) 


pe R? R, f(t, y) = ty = 


if (x,y) Z (0,0) and f(0,0) = 0; 


h) $f : \mathbb{R}^2 \to \mathbb{R}$, $f(x,y) = \ldots$ if $(x,y) \ne (0,0)$ and $f(0,0) = 0$.

8. Prove that $f(x) = x^2$ is uniformly continuous on $[0,1]$, but it is not on the whole $\mathbb{R}$ (Hint: use $x_n = \sqrt{n}$; $x_{n+1} - x_n \to 0$, but $f(x_{n+1}) - f(x_n) = 1 \nrightarrow 0$).


9. Prove that f(x) = <3 is uniformly continuous on [1,2], but not 


on R 


1 


10. Let $(X,d)$ be a metric space. Prove that, for any fixed $a$ in $X$, the mapping $f_a(x) = d(x,a)$ is a uniformly continuous function defined on $X$ with values in $\mathbb{R}$.


11. Let $f : A \to \mathbb{R}$, $f(x,y,z) = x + y + z$, where

$$A = \{(x,y,z) \in \mathbb{R}^3 : 1 \le x^2 + y^2 + z^2 \le 4\}.$$

Prove that $f(A)$ is a closed interval in $\mathbb{R}$. Find it.
12. Do the same for

$$f(x,y) = x + y,\quad x \in [1,2],\ y \in [1,2].$$


CHAPTER 7 


Partial derivatives. Differentiability. 


1. Partial derivatives. Differentiability. 


Let $A$ be an open subset of $\mathbb{R}$, $a$ a fixed point in $A$ and let $f : A \to \mathbb{R}$ be a function defined on $A$ with values in $\mathbb{R}$. Let $B(a,r) = (a-r, a+r)$, $r > 0$, be a small ball (an open interval in our particular case) of radius $r$ and with centre $a$, which is contained in $A$. Let $h$ be a small quantity such that $a + h \in B(a,r)$. We call this $h$ an "increment" of $a$ in $B(a,r)$ (or in $A$ if one takes $h$ with $a + h \in A$). The difference $f(a+h) - f(a)$ is called the increment of $f$ at $a$, corresponding to the increment $h$ of $a$. So, here appears a new function $\varphi_{a,f}(h) = f(a+h) - f(a)$. This new function depends on $a$ and on $f$. It is defined on a small ball $(-\varepsilon, \varepsilon)$, which has $0$ as its centre and radius $\varepsilon$ (at most $r$ (why?)). The description of this last function is important when we want to evaluate the variation of a phenomenon around a given point $a$. For instance, if a worker has the salary $a$ and if his salary increases by $h$, what is the increment $f(a+h) - f(a)$ of his family's educational level? We say that the increment $f(a+h) - f(a)$ is approximately linear around $a$, if

(1.1) $f(a+h) - f(a) = \lambda(a,f) \cdot h + h \cdot \omega_{a,f}(h)$,

where $\omega_{a,f}$ is a function of $h$ defined on $(-\varepsilon, \varepsilon)$, $\omega_{a,f}(0) = 0$ and $\omega_{a,f}(h) \to 0$ when $h \to 0$ (i.e. $\omega_{a,f}$ is continuous at $0$). Here $\lambda(a,f)$ is a real number which depends on $f$ and on $a$.

The birth of differential calculus began with the following result. 


THEOREM 63. With the above notation and hypotheses, the increment of $f$ is approximately linear around $a$ if and only if $f$ is differentiable at $a$ and, in this case, $f'(a) = \lambda(a,f)$. Thus,

(1.2) $f(a+h) - f(a) = f'(a) \cdot h + h \cdot \omega_{a,f}(h)$.

Hence,

$$f(a+h) - f(a) \approx f'(a) \cdot h$$

and the error $h \cdot \omega_{a,f}(h)$ is a zero $O(h)$ of $h$, i.e.

$$\frac{h \cdot \omega_{a,f}(h)}{h} \to 0 \quad \text{when } h \to 0.$$


PROOF. Let us divide the equality (1.1) by $h$ and let $h \to 0$. We obtain that the limit

$$\lim_{h \to 0} \frac{f(a+h) - f(a)}{h} = \lambda(a,f)$$

exists. So, if the increment $f(a+h) - f(a)$ is approximately linear around $a$, $f$ is differentiable at $a$ and $f'(a) = \lambda(a,f)$. Conversely, let us assume that $f$ is differentiable at $a$. Then, if one constructs

(1.3) $\omega_{a,f}(h) = \dfrac{f(a+h) - f(a)}{h} - f'(a)$ for $h \ne 0$, and $\omega_{a,f}(0) = 0$,

it is easy to verify that this function $\omega_{a,f}$ is continuous at $0$ and it is zero at $h = 0$ (do it!). If we now take for $\lambda(a,f)$ the number $f'(a)$, and for $\omega_{a,f}$ the function constructed in (1.3), we obtain formula (1.1), i.e. the increment of $f$ is approximately linear around $a$.


Let us evaluate the increment of $f(x) = -x^2 + 3x - 7$ at $a = 10$ if the increment $h$ of $a$ is $0.5$. We simply apply formula (1.2) and find

$$f(10 + 0.5) - f(10) = f'(10) \cdot 0.5 + 0.5 \cdot \omega_{10,f}(0.5) \approx f'(10) \cdot 0.5 = -8.5,$$

since $f'(10) = -2 \cdot 10 + 3 = -17$.
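The quality of this approximation can be checked directly. The short sketch below assumes that the function of the example reads $f(x) = -x^2 + 3x - 7$ (the printed formula is hard to read in our copy), and compares the true increment with the linear part $f'(a) \cdot h$ from (1.2); the variable names are ours.

```python
def f(x):
    # Assumed reading of the example: f(x) = -x^2 + 3x - 7.
    return -x**2 + 3*x - 7

def f_prime(x):
    return -2*x + 3

a, h = 10.0, 0.5
exact = f(a + h) - f(a)          # true increment of f at a
linear = f_prime(a) * h          # linear part f'(a) * h
omega = (exact - linear) / h     # the correction term omega(h) from (1.1)
print(exact, linear, omega)
```

Under this assumption the true increment is $-8.75$ and the linear part is $-8.5$; the difference is $h \cdot \omega_{10,f}(h)$ with $\omega_{10,f}(h) = -h$, which indeed tends to $0$ together with $h$.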


DEFINITION 24. With the above notation, the linear mapping $df(a) : \mathbb{R} \to \mathbb{R}$, defined by

$$df(a)(h) = f'(a) \cdot h,$$

is called the first differential of $f$ at $a$. It exists if and only if the first derivative $f'(a)$ of $f$ at $a$ exists (why?).


Thus,

$$df(a)(h) \approx f(a+h) - f(a),$$

i.e. the value $df(a)(h)$ of the first differential of $f$ at $a$, computed at the increment $h$ of $a$, is approximately equal to the corresponding increment $f(a+h) - f(a)$ of $f$ at $a$.

Before extending the notion of a differential to a vector function we need some other simpler notions.

Let $A$ be an open subset of $\mathbb{R}^n$, $\mathbf{f} : A \to \mathbb{R}^m$ a vector function of $n$ variables, defined on $A$ with values in the normed (or metric) space $\mathbb{R}^m$, and $\mathbf{a} = (a_1, a_2, \ldots, a_n)$ a point in $A$. We write $\mathbf{f} = (f_1, f_2, \ldots, f_m)$, where $f_1, f_2, \ldots, f_m$ are the $m$ scalar component functions of $\mathbf{f}$. For the moment we take $m = 1$ and write $\mathbf{f} = f$, like a scalar function (with values in $\mathbb{R}$). Let us fix a variable $x_j$ ($j = 1, 2, \ldots, n$) of the variable vector

$$\mathbf{x} = (x_1, x_2, \ldots, x_{j-1}, x_j, x_{j+1}, \ldots, x_n).$$

For this fixed $j$, let us define a "partial function" $\varphi_j$ of $f$ at $\mathbf{a}$. For this we fix all the other variables $x_1, x_2, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n$ (except $x_j$) by putting

$$x_1 = a_1,\ x_2 = a_2,\ \ldots,\ x_{j-1} = a_{j-1},\ x_{j+1} = a_{j+1},\ \ldots,\ x_n = a_n$$

and let us leave free the variable $x_j$ in

$$f(\mathbf{x}) = f(x_1, x_2, \ldots, x_{j-1}, x_j, x_{j+1}, \ldots, x_n),$$

i.e. we define

(1.4) $\varphi_j(t) = f(a_1, a_2, \ldots, a_{j-1}, t, a_{j+1}, \ldots, a_n)$,

where $t$ runs over the projection $pr_j(A)$ of $A$ along the $Ox_j$-axis, where $pr_j(x_1, x_2, \ldots, x_j, \ldots, x_n) = x_j$.

DEFINITION 25. With the above notation, if the function $\varphi_j$ is differentiable at $t = a_j$, one says that $f$ has a partial derivative $\varphi'_j(a_j)$ with respect to the variable $x_j$ at $\mathbf{a}$ and we denote it by $\frac{\partial f}{\partial x_j}(\mathbf{a})$. The mapping $\mathbf{x} \mapsto \frac{\partial f}{\partial x_j}(\mathbf{x})$, $\mathbf{x} \in A$, is called the partial derivative of $f$ with respect to $x_j$.


Practically, if we want to compute the partial derivative of a scalar function $f$ of $n$ variables

$$x_1, x_2, \ldots, x_{j-1}, x_j, x_{j+1}, \ldots, x_n,$$

with respect to $x_j$, we think of the other variables

$$x_1, x_2, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n$$

as being constants (parameters, or "inactivated" variables) and we apply the usual differentiation rules to the "active" variable $x_j$. If $n = 1$, we usually denote $x_1$ by $x$. If $n = 2$, we usually denote $x_1$ by $x$ and $x_2$ by $y$. If $n = 3$, we usually denote $x_1$ by $x$, $x_2$ by $y$ and $x_3$ by $z$. For instance, let

$$f(x,y) = \sin^2(x^3 + y^3)$$

be defined on $\mathbb{R}^2$ and let $\mathbf{a} = (0, \sqrt[3]{\pi/2})$ be the fixed point at which we want to compute the partial derivatives of $f$ (with respect to $x$ and to $y$ respectively). Let us use the definition to compute $\frac{\partial f}{\partial x}(\mathbf{a})$. In our case,

$$\varphi_1(t) = \sin^2\left(t^3 + \frac{\pi}{2}\right)$$

and

$$\varphi'_1(t) = 2\sin\left(t^3 + \frac{\pi}{2}\right) \cdot \cos\left(t^3 + \frac{\pi}{2}\right) \cdot 3t^2$$

(we just used the chain rule for computing the derivative of a composed function of one variable). Now,

$$\frac{\partial f}{\partial x}\left(\left(0, \sqrt[3]{\pi/2}\right)\right) = \varphi'_1(0) = 0.$$

Let us now compute

(1.5) $\dfrac{\partial f}{\partial y}((x,y)) = 2\sin(x^3 + y^3) \cdot \cos(x^3 + y^3) \cdot 3y^2$.

Here, we simply considered that the initial function depended only on $y$ and we looked at $x$ as a constant. If we want to compute $\frac{\partial f}{\partial y}((0, \sqrt[3]{\pi/2}))$, we simply put $x = 0$ and $y = \sqrt[3]{\pi/2}$ in the general expression (1.5) of $\frac{\partial f}{\partial y}((x,y))$. Thus, $\frac{\partial f}{\partial y}((0, \sqrt[3]{\pi/2}))$ is also $0$. Since both partial derivatives of $f$ at $(0, \sqrt[3]{\pi/2})$ are zero, we say that this last point is a stationary (or critical) point.
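The rule "freeze all the variables except one" translates directly into a numerical recipe. The sketch below is an illustration under our own assumptions: it reads the example as $f(x,y) = \sin^2(x^3 + y^3)$ with the point $(0, \sqrt[3]{\pi/2})$, and it estimates the partial derivatives by central differences; the helper name `partial` is ours.

```python
import math

def f(x, y):
    return math.sin(x**3 + y**3) ** 2

def partial(func, point, j, step=1e-6):
    """Central-difference estimate of the j-th partial derivative at `point`."""
    plus = list(point)
    minus = list(point)
    plus[j] += step      # only the "active" variable is moved
    minus[j] -= step
    return (func(*plus) - func(*minus)) / (2.0 * step)

a = (0.0, (math.pi / 2.0) ** (1.0 / 3.0))
df_dx = partial(f, a, 0)
df_dy = partial(f, a, 1)
print(df_dx, df_dy)
```

Both estimates are zero up to rounding, in agreement with the computation above. At a generic point the same helper can be compared with formula (1.5).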

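If $f$ is a function defined on an open subset $A$ of $\mathbb{R}^n$ which has partial derivatives with respect to all its variables at a point $\mathbf{a}$, we define the gradient vector of $f$ at $\mathbf{a}$ by the formula:

$$\operatorname{grad} f(\mathbf{a}) = \left(\frac{\partial f}{\partial x_1}(\mathbf{a}), \frac{\partial f}{\partial x_2}(\mathbf{a}), \ldots, \frac{\partial f}{\partial x_n}(\mathbf{a})\right).$$

We say that $\mathbf{a}$ is a critical (stationary) point for $f$ if $\operatorname{grad} f(\mathbf{a}) = \mathbf{0}$. The gradient is the direct generalization of the notion of "velocity".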

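We know from any course of "Linear Algebra" that a mapping $T : \mathbb{R}^n \to \mathbb{R}^m$ is said to be a linear mapping if $T(\mathbf{x} + \mathbf{y}) = T(\mathbf{x}) + T(\mathbf{y})$ and $T(\alpha\mathbf{x}) = \alpha T(\mathbf{x})$ for any $\mathbf{x}, \mathbf{y}$ in $\mathbb{R}^n$ and $\alpha$ in $\mathbb{R}$. For instance, if $T : \mathbb{R} \to \mathbb{R}$ is linear, then $T(x) = xT(1)$ for any $x \in \mathbb{R}$. Hence, $T(x) = \lambda x$ ($\lambda = T(1)$!) for any $x$ in $\mathbb{R}$. If $T : \mathbb{R}^n \to \mathbb{R}$ is linear then, by taking

$$\mathbf{x} = (x_1, x_2, \ldots, x_n) = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \ldots + x_n\mathbf{e}_n,$$

where $\mathbf{e}_1 = (1,0,0,\ldots,0)$, $\mathbf{e}_2 = (0,1,0,\ldots,0)$, ..., $\mathbf{e}_n = (0,0,0,\ldots,0,1)$, we get that

$$T(\mathbf{x}) = x_1T(\mathbf{e}_1) + x_2T(\mathbf{e}_2) + \ldots + x_nT(\mathbf{e}_n) = \lambda_1x_1 + \lambda_2x_2 + \ldots + \lambda_nx_n,$$

where $\lambda_i = T(\mathbf{e}_i)$ for any $i = 1, 2, \ldots, n$. It is easy to see that if $T_1, T_2, \ldots, T_m$ are the component functions of $T$, then $T$ is a linear mapping if and only if all the component functions $T_1, T_2, \ldots, T_m$ of $T$ are linear (prove it!).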


THEOREM 64. Any linear mapping $T : \mathbb{R}^n \to \mathbb{R}^m$ is a continuous vector function of $n$ variables.

PROOF. It is sufficient to prove that any component function $T_i$, $i = 1, 2, \ldots, m$, of $T$ is continuous (see Theorem 54). This means that we can reduce ourselves to the case of $m = 1$, i.e. to the case of a scalar function $T : \mathbb{R}^n \to \mathbb{R}$. Let

$$\{\mathbf{e}_1 = (1,0,0,\ldots,0),\ \mathbf{e}_2 = (0,1,0,\ldots,0),\ \ldots,\ \mathbf{e}_n = (0,0,0,\ldots,0,1)\}$$

be the canonical basis of $\mathbb{R}^n$. This means that any vector $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ can be uniquely represented as:

$$\mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \ldots + x_n\mathbf{e}_n.$$

Let us denote

$$a_1 = T(\mathbf{e}_1),\ a_2 = T(\mathbf{e}_2),\ \ldots,\ a_n = T(\mathbf{e}_n).$$

These are fixed real numbers. Hence,

$$T(\mathbf{x}) = x_1T(\mathbf{e}_1) + \ldots + x_nT(\mathbf{e}_n) = x_1a_1 + \ldots + x_na_n.$$

If

$$\mathbf{x}^{(k)} = (x_1^{(k)}, \ldots, x_n^{(k)}) \to \mathbf{x} = (x_1, x_2, \ldots, x_n),$$

when $k \to \infty$, then

$$x_1^{(k)} \to x_1,\ x_2^{(k)} \to x_2,\ \ldots,\ x_n^{(k)} \to x_n,$$

when $k \to \infty$ (componentwise convergence). Thus,

$$T(\mathbf{x}^{(k)}) = x_1^{(k)}a_1 + \ldots + x_n^{(k)}a_n \to x_1a_1 + \ldots + x_na_n,$$

which is just $T(\mathbf{x})$. Hence, $T$ is a continuous mapping.


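REMARK 26. Let us define the associated matrix of

$$T = (T_1, T_2, \ldots, T_m)$$

by $a_{ij} = T_i(\mathbf{e}_j)$ for $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$, so the matrix $A = (a_{ij})$ is an $m \times n$ matrix with entries in $\mathbb{R}$. If we compute now

$$\|T(\mathbf{x})\|^2 = T_1(\mathbf{x})^2 + T_2(\mathbf{x})^2 + \ldots + T_m(\mathbf{x})^2 = \left(\sum_{j=1}^{n} a_{1j}x_j\right)^2 + \left(\sum_{j=1}^{n} a_{2j}x_j\right)^2 + \ldots + \left(\sum_{j=1}^{n} a_{mj}x_j\right)^2$$

$$\le \sum_{j=1}^{n} a_{1j}^2 \sum_{j=1}^{n} x_j^2 + \sum_{j=1}^{n} a_{2j}^2 \sum_{j=1}^{n} x_j^2 + \ldots + \sum_{j=1}^{n} a_{mj}^2 \sum_{j=1}^{n} x_j^2 = \|\mathbf{x}\|^2 \|A\|^2,$$

where we recall that

$$\|A\| = \sqrt{\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}^2}.$$

Thus,

(1.6) $\|T(\mathbf{x})\| \le \|A\| \, \|\mathbf{x}\|$.

From here we can easily prove the continuity of $T$ directly (do it!).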
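Inequality (1.6) is easy to test on examples. The sketch below is only a numerical check on a randomly chosen matrix, assuming that $\|A\|$ denotes the square root of the sum of the squares of all entries; the helper names `apply`, `norm` and `matrix_norm` are ours.

```python
import math
import random

def apply(A, x):
    # T(x) = A x for an m x n matrix A given as a list of rows.
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def matrix_norm(A):
    # The norm ||A|| used in (1.6): square root of the sum of all squared entries.
    return math.sqrt(sum(a * a for row in A for a in row))

rng = random.Random(1)
A = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]   # m = 2, n = 3
ok = all(
    norm(apply(A, x)) <= matrix_norm(A) * norm(x) + 1e-12
    for x in ([rng.uniform(-5, 5) for _ in range(3)] for _ in range(1000))
)
print(ok)
```

A check of this kind proves nothing, of course, but it makes the meaning of (1.6) concrete: the linear mapping $T$ cannot stretch a vector by more than the factor $\|A\|$, and this is what gives its continuity.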


Now, we come back to the definition of the linear approximation of the increment $f(\mathbf{a}+\mathbf{h}) - f(\mathbf{a})$ of a function $f$ around a point $\mathbf{a}$, in a general situation.


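DEFINITION 26. (Fréchet) Let $D$ be an open subset of $\mathbb{R}^n$ and let $\mathbf{a}$ be a fixed point in $D$. Let $f : D \to \mathbb{R}$ be a function defined on $D$ with values in $\mathbb{R}$. We say that $f$ is differentiable at $\mathbf{a}$ if there is a linear mapping $T_{\mathbf{a}} = T : \mathbb{R}^n \to \mathbb{R}$ and a scalar function $\varphi(\mathbf{h})$ which is continuous at $\mathbf{0} = (0,0,\ldots,0)$ ($n$ times), defined on a small ball $B(\mathbf{0},r) \subset \mathbb{R}^n$, $r > 0$, with $\varphi(\mathbf{0}) = 0$ and $\lim_{\mathbf{h} \to \mathbf{0}} \frac{\varphi(\mathbf{h})}{\|\mathbf{h}\|} = 0$, such that

(1.7) $f(\mathbf{a}+\mathbf{h}) - f(\mathbf{a}) = T(\mathbf{h}) + \varphi(\mathbf{h})$.


This means that the increment $f(\mathbf{a}+\mathbf{h}) - f(\mathbf{a})$ can be linearly approximated by the linear mapping $T$ (which depends on $\mathbf{a}$ and on $f$) around the point $\mathbf{a}$, up to a function $\varphi(\mathbf{h})$ which is a zero of $\mathbf{h}$ ($O(\mathbf{h})$) of order 1 ($\lim_{\mathbf{h} \to \mathbf{0}} \frac{\varphi(\mathbf{h})}{\|\mathbf{h}\|} = 0$). The linear mapping $T$ is called the (first) differential of $f$ at $\mathbf{a}$. We write it as $df(\mathbf{a})$. Hence, formula (1.7) becomes

(1.8) $f(\mathbf{a}+\mathbf{h}) - f(\mathbf{a}) = df(\mathbf{a})(\mathbf{h}) + \varphi(\mathbf{h})$.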


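REMARK 27. It is clear that $f$ is differentiable at $\mathbf{a}$ if and only if there is a linear function $T : \mathbb{R}^n \to \mathbb{R}$ such that the following limit exists and it is zero:

(1.9) $\displaystyle\lim_{\mathbf{h} \to \mathbf{0}} \frac{f(\mathbf{a}+\mathbf{h}) - f(\mathbf{a}) - T(\mathbf{h})}{\|\mathbf{h}\|} = 0$.


Indeed, if (1.9) is true, then $\varphi(\mathbf{h}) = f(\mathbf{a}+\mathbf{h}) - f(\mathbf{a}) - T(\mathbf{h})$ is continuous at $\mathbf{0}$ and its value at $\mathbf{0}$ is $0$. If it were not continuous at $\mathbf{0}$, there would be an $\varepsilon > 0$ such that

$$|f(\mathbf{a}+\mathbf{h}) - f(\mathbf{a}) - T(\mathbf{h})| \ge \varepsilon$$

for a sequence of values of $\mathbf{h}$ tending to $\mathbf{0}$. So, along this sequence,

$$\frac{|f(\mathbf{a}+\mathbf{h}) - f(\mathbf{a}) - T(\mathbf{h})|}{\|\mathbf{h}\|} \ge \frac{\varepsilon}{\|\mathbf{h}\|} \to \infty,$$

when $\mathbf{h} \to \mathbf{0}$. Hence (1.9) could not be true, a contradiction!


In short, $f$ is differentiable at $\mathbf{a}$ if it can be "well" approximated on a small neighborhood of $\mathbf{a}$ by a formula of the following type:

(1.10) $f(\mathbf{a}+\mathbf{h}) \approx f(\mathbf{a}) + T(\mathbf{h})$,

where $T$ is a linear mapping and $\mathbf{h}$ is a small increment of $\mathbf{a}$. This last interpretation is very useful in Physics and in Engineering when a phenomenon is "linearized".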
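Condition (1.9) can be observed numerically. In the sketch below everything is an illustrative assumption of ours: the function `f`, the point $\mathbf{a} = (1,2)$, the direction of the increments, and the choice of $T$ as the mapping built from the partial derivatives of $f$ at $\mathbf{a}$ (the form that Theorem 67 justifies).

```python
import math

def f(x, y):
    # Hypothetical smooth function, used only to illustrate (1.9) and (1.10).
    return x * x * y + math.sin(y)

def T(h, a):
    # Candidate linear mapping: partial derivatives of f at a, applied to h.
    fx = 2.0 * a[0] * a[1]
    fy = a[0] * a[0] + math.cos(a[1])
    return fx * h[0] + fy * h[1]

def relative_error(a, h):
    # The quotient |f(a + h) - f(a) - T(h)| / ||h|| from (1.9).
    increment = f(a[0] + h[0], a[1] + h[1]) - f(a[0], a[1])
    return abs(increment - T(h, a)) / math.hypot(h[0], h[1])

a = (1.0, 2.0)
ratios = [relative_error(a, (t, -t)) for t in (1e-1, 1e-2, 1e-3, 1e-4)]
print(ratios)
```

The quotients decrease roughly in proportion to $\|\mathbf{h}\|$, so the error of the linearization (1.10) is small even when compared with the size of the increment itself.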

The next big problem is how to compute this $T$ in terms of $f$ and $\mathbf{a}$. But, first of all, let us use only the definition and the remark above to "guess" the differentials of some simple functions. For instance, if $f$ has only one variable, we find again Definition 24. If $f$ is a constant function, then $df(\mathbf{a})$ is the zero linear mapping (prove this!). The first differential of a linear mapping $T : \mathbb{R}^n \to \mathbb{R}$ is $T$ itself (why?). In particular, the $i$-th projection $pr_i : \mathbb{R}^n \to \mathbb{R}$,

$$pr_i(h_1, h_2, \ldots, h_i, \ldots, h_n) = h_i,$$

is differentiable and its differential $pr_i$ is denoted by $dx_i$, or $dx$, $dy$, $dz$ in the 3D-case. So

$$dy(1,2,-3)(3,1,-7) = 1, \quad dz(a_1,a_2,a_3)(-2,3,5) = 5$$

for any $\mathbf{a} = (a_1, a_2, a_3)$.


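THEOREM 65. If $f$ is differentiable at $\mathbf{a} \in D$, where $D$ is an open subset of $\mathbb{R}^n$, then $f$ is continuous at $\mathbf{a}$. This means that the property of differentiability is stronger than the property of continuity.


PROOF. Let $\{\mathbf{a}^{(k)}\}$ be a sequence of vectors in $\mathbb{R}^n$ which is convergent to $\mathbf{a}$ and let $\mathbf{h}^{(k)} = \mathbf{a}^{(k)} - \mathbf{a}$ ($\to \mathbf{0}$). Then

$$f(\mathbf{a} + \mathbf{h}^{(k)}) = f(\mathbf{a}) + df(\mathbf{a})(\mathbf{h}^{(k)}) + \varphi(\mathbf{h}^{(k)})$$

(see (1.8)). Since $df(\mathbf{a})$ is a linear mapping, it is continuous (see Theorem 64), so

$$\lim_{k \to \infty} df(\mathbf{a})(\mathbf{h}^{(k)}) = 0.$$

Since $\lim_{\mathbf{h} \to \mathbf{0}} \frac{\varphi(\mathbf{h})}{\|\mathbf{h}\|} = 0$, one has that $\lim_{k \to \infty} \varphi(\mathbf{h}^{(k)}) = 0$ (why?). Hence,

$$f(\mathbf{a}^{(k)}) = f(\mathbf{a} + \mathbf{h}^{(k)}) \to f(\mathbf{a}),$$

when $k \to \infty$.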


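THEOREM 66. The linear mapping $T = df(\mathbf{a})$ is uniquely determined by $f$ and $\mathbf{a}$.

PROOF. The proof of this result is implicitly included in the statement of the next theorem (see Theorem 67). However, we give here another proof.

If there were another linear mapping $U$ such that

(1.11) $f(\mathbf{a}+\mathbf{h}) - f(\mathbf{a}) = U(\mathbf{h}) + \varphi_1(\mathbf{h})$,

where $\varphi_1(\mathbf{0}) = 0$, $\varphi_1$ is continuous at $\mathbf{0}$ and $\lim_{\mathbf{h} \to \mathbf{0}} \frac{\varphi_1(\mathbf{h})}{\|\mathbf{h}\|} = 0$, we could write that

$$T(\mathbf{h}) + \varphi(\mathbf{h}) = U(\mathbf{h}) + \varphi_1(\mathbf{h})$$

for all $\mathbf{h}$ in a small ball centred at the origin. Moreover,

(1.12) $\displaystyle\lim_{\mathbf{h} \to \mathbf{0}} \frac{(T-U)(\mathbf{h})}{\|\mathbf{h}\|} = \lim_{\mathbf{h} \to \mathbf{0}} \frac{\varphi_1(\mathbf{h}) - \varphi(\mathbf{h})}{\|\mathbf{h}\|} = 0$.

We want to prove that for any $\mathbf{x}$ in $\mathbb{R}^n$ one has $T(\mathbf{x}) = U(\mathbf{x})$. Assume the contrary, namely that there is an $\mathbf{x}_0$ such that $(T-U)(\mathbf{x}_0) \ne 0$. If $t > 0$ is small, then $t\mathbf{x}_0$ is small, i.e. it is close to $\mathbf{0}$, because $\|t\mathbf{x}_0\| = t\|\mathbf{x}_0\| \to 0$, when $t \to 0$, $t > 0$. Let us come back to (1.12) and write

$$\lim_{t \to 0} \frac{(T-U)(t\mathbf{x}_0)}{\|t\mathbf{x}_0\|} = \lim_{t \to 0} \frac{t \cdot (T-U)(\mathbf{x}_0)}{t \cdot \|\mathbf{x}_0\|} = \frac{(T-U)(\mathbf{x}_0)}{\|\mathbf{x}_0\|} = 0.$$

So, $(T-U)(\mathbf{x}_0) = 0$ and we have just obtained a contradiction. Hence, there is no $\mathbf{x}_0$ with $(T-U)(\mathbf{x}_0) \ne 0$ and so $T = U$.


Thus, if we find a method to compute $T = df(\mathbf{a})$, this $T$ is unique. It depends only on $f$ and on $\mathbf{a}$.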


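THEOREM 67. If $f$ is differentiable at $\mathbf{a}$, then all the partial derivatives $\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n}$ exist at $\mathbf{a}$ and

(1.13) $df(\mathbf{a})(h_1, h_2, \ldots, h_n) = \dfrac{\partial f}{\partial x_1}(\mathbf{a})h_1 + \ldots + \dfrac{\partial f}{\partial x_n}(\mathbf{a})h_n$,

or, using the projection $pr_i = dx_i$ notation (see Remark 27), we get

(1.14) $df(\mathbf{a}) = \dfrac{\partial f}{\partial x_1}(\mathbf{a})dx_1 + \dfrac{\partial f}{\partial x_2}(\mathbf{a})dx_2 + \ldots + \dfrac{\partial f}{\partial x_n}(\mathbf{a})dx_n$.

Moreover, if $f$ is of class $C^1$ on a ball $B(\mathbf{a},r)$, for a small $r > 0$, i.e. $f \in C^1(B(\mathbf{a},r))$ (this means that $f$ has partial derivatives with respect to all variables $x_1, x_2, \ldots, x_n$ and all of these are continuous on $B(\mathbf{a},r)$), then $f$ is differentiable at $\mathbf{a}$ and formula (1.14) holds.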


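PROOF. We suppose that $f$ is differentiable at $\mathbf{a}$ and let $T = df(\mathbf{a})$ be its differential at $\mathbf{a}$. We know from Linear Algebra or from the proof of Theorem 64 that

$$T(h_1, h_2, \ldots, h_n) = \lambda_1h_1 + \lambda_2h_2 + \ldots + \lambda_nh_n,$$

where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are fixed real numbers (recall that $\lambda_i = T(\mathbf{e}_i)$, where $\mathbf{e}_i$ is the $i$-th vector of the canonical basis of $\mathbb{R}^n$, etc.). Let us now choose a $j$ in $\{1, 2, \ldots, n\}$, let us take $\gamma \ne 0$, close to $0$, and let us also take

$$\mathbf{h} = (0, \ldots, 0, \gamma, 0, \ldots, 0) = \gamma\mathbf{e}_j$$

in formula (1.9). We get

$$\lim_{\gamma \to 0} \frac{f(a_1, a_2, \ldots, a_{j-1}, a_j + \gamma, a_{j+1}, \ldots, a_n) - f(\mathbf{a}) - \gamma\lambda_j}{\gamma} = 0.$$

Since this limit exists, the partial derivative with respect to $x_j$ exists and, from this last formula, we get that $\frac{\partial f}{\partial x_j}(\mathbf{a}) = \lambda_j$ for any $j \in \{1, 2, \ldots, n\}$. Hence,

$$T(h_1, h_2, \ldots, h_n) = \frac{\partial f}{\partial x_1}(\mathbf{a})h_1 + \ldots + \frac{\partial f}{\partial x_n}(\mathbf{a})h_n$$

and the first part of the statement is completely proved.

Let us now assume that $f$ is of class $C^1$ on a ball $B(\mathbf{a},r)$, $r > 0$. Let us take the following linear mapping $T : \mathbb{R}^n \to \mathbb{R}$:

$$T(h_1, h_2, \ldots, h_n) = \frac{\partial f}{\partial x_1}(\mathbf{a})h_1 + \ldots + \frac{\partial f}{\partial x_n}(\mathbf{a})h_n.$$

Let us prove that this $T$ is indeed the differential of $f$ at $\mathbf{a}$. To make things easier, let us also assume that $n = 2$. Then, we want to prove that

(1.15) $\displaystyle\lim_{(h_1,h_2) \to (0,0)} \frac{f(a_1 + h_1, a_2 + h_2) - f(a_1, a_2) - T(h_1, h_2)}{\sqrt{h_1^2 + h_2^2}} = 0$.

Let us write:

(1.16) $f(a_1 + h_1, a_2 + h_2) - f(a_1, a_2) = [f(a_1 + h_1, a_2 + h_2) - f(a_1, a_2 + h_2)] + [f(a_1, a_2 + h_2) - f(a_1, a_2)]$.

Now, let us consider the function

$$g_1(t) = f(t, a_2 + h_2), \quad t \in [a_1, a_1 + h_1]^{\pm}$$

and let us apply to it Lagrange's formula:

(1.17) $f(a_1 + h_1, a_2 + h_2) - f(a_1, a_2 + h_2) = \dfrac{\partial f}{\partial x_1}(c_1, a_2 + h_2) \cdot h_1$,

where $c_1 \in [a_1, a_1 + h_1]^{\pm}$. Let us do the same for $f(a_1, a_2 + h_2) - f(a_1, a_2)$ by considering the function

$$g_2(t) = f(a_1, t), \quad t \in [a_2, a_2 + h_2]^{\pm}.$$

We get

(1.18) $f(a_1, a_2 + h_2) - f(a_1, a_2) = \dfrac{\partial f}{\partial x_2}(a_1, c_2) \cdot h_2$,

where $c_2 \in [a_2, a_2 + h_2]^{\pm}$. Let us come back to (1.16) with the expressions from (1.17) and (1.18). So,

(1.19) $f(a_1 + h_1, a_2 + h_2) - f(a_1, a_2) - T(h_1, h_2) = \left[\dfrac{\partial f}{\partial x_1}(c_1, a_2 + h_2) - \dfrac{\partial f}{\partial x_1}(a_1, a_2)\right]h_1 + \left[\dfrac{\partial f}{\partial x_2}(a_1, c_2) - \dfrac{\partial f}{\partial x_2}(a_1, a_2)\right]h_2$.

Since the function $f$ is of class $C^1$ in a small neighborhood of $\mathbf{a} = (a_1, a_2)$, one has that:

$$\frac{\partial f}{\partial x_1}(c_1, a_2 + h_2) - \frac{\partial f}{\partial x_1}(a_1, a_2) \to 0,$$

when $\mathbf{h} \to \mathbf{0}$, i.e. $h_1 \to 0$ and $h_2 \to 0$, and

$$\frac{\partial f}{\partial x_2}(a_1, c_2) - \frac{\partial f}{\partial x_2}(a_1, a_2) \to 0,$$

when $\mathbf{h} \to \mathbf{0}$. Since

$$\frac{|h_1|}{\|\mathbf{h}\|} \le 1, \quad \frac{|h_2|}{\|\mathbf{h}\|} \le 1,$$

one has that the limit in (1.15) is zero (do this slowly, step by step!). Hence, $f$ is differentiable at $\mathbf{a}$ and its differential has the usual form:

$$df(\mathbf{a}) = \frac{\partial f}{\partial x_1}(\mathbf{a})dx_1 + \frac{\partial f}{\partial x_2}(\mathbf{a})dx_2.$$

For an arbitrary $n$ the proof is similar, but the writing is more complicated.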


This last theorem is very useful in computations. For instance, let $f : \mathbb{R}^3 \to \mathbb{R}$ be defined by
\[ f(x, y, z) = \ln(1 + x^2 + y^4 + z^6). \]
All the partial derivatives
\[ \frac{\partial f}{\partial x} = \frac{2x}{1 + x^2 + y^4 + z^6}, \quad \frac{\partial f}{\partial y} = \frac{4y^3}{1 + x^2 + y^4 + z^6} \]
and
\[ \frac{\partial f}{\partial z} = \frac{6z^5}{1 + x^2 + y^4 + z^6} \]
exist and are continuous on the whole $\mathbb{R}^3$, in particular around the point $(1, -1, 2)$. Applying the last theorem (see Theorem 67) we see that $f$ is differentiable at $(1, -1, 2)$ and
\[ df(1, -1, 2) = \frac{\partial f}{\partial x}(1, -1, 2)\, dx + \frac{\partial f}{\partial y}(1, -1, 2)\, dy + \frac{\partial f}{\partial z}(1, -1, 2)\, dz = \frac{2}{67}\, dx - \frac{4}{67}\, dy + \frac{192}{67}\, dz. \]
Recall a basic fact: $df(1, -1, 2)$ is NOT a number, but a linear mapping from $\mathbb{R}^3$ to $\mathbb{R}$. For instance,
\[ df(1, -1, 2)(3, -4, 0) = \left( \frac{2}{67}\, dx - \frac{4}{67}\, dy + \frac{192}{67}\, dz \right)(3, -4, 0) = \frac{2}{67} \cdot 3 - \frac{4}{67} \cdot (-4) + \frac{192}{67} \cdot 0 = \frac{22}{67}. \]
This last one is a real number because $df(1, -1, 2) : \mathbb{R}^3 \to \mathbb{R}$ is a linear mapping.
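The computation above can be cross-checked numerically. The following Python sketch is only an illustration: the helper names `grad_fd` and `differential` are assumptions introduced here, and the partial derivatives are approximated by central differences rather than computed exactly.

```python
import math

def f(x, y, z):
    # The scalar function of the example: ln(1 + x^2 + y^4 + z^6).
    return math.log(1 + x**2 + y**4 + z**6)

def grad_fd(func, point, step=1e-6):
    # Central-difference approximation of all partial derivatives at `point`.
    grad = []
    for i in range(len(point)):
        plus = list(point)
        minus = list(point)
        plus[i] += step
        minus[i] -= step
        grad.append((func(*plus) - func(*minus)) / (2 * step))
    return grad

def differential(func, point, h):
    # Value of the linear mapping df(point) at the vector h.
    return sum(g * hi for g, hi in zip(grad_fd(func, point), h))

a = (1.0, -1.0, 2.0)
exact_grad = [2 / 67, -4 / 67, 192 / 67]      # coefficients found in the text
approx_grad = grad_fd(f, a)
value = differential(f, a, (3.0, -4.0, 0.0))  # should be close to 22/67
```

The point of the sketch is that `differential(f, a, h)` takes a vector `h` as input: the differential is a linear mapping, and only its value at a given `h` is a number.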

We want now to extend the notion of differentiability from scalar 
functions of n variables to vector functions. 


DEFINITION 27. Let $f : D \to \mathbb{R}^m$ be a vector function with its components $(f_1, f_2, \ldots, f_m)$, defined on an open subset $D$ of $\mathbb{R}^n$ with values in $\mathbb{R}^m$. We say that $f$ is differentiable at $a \in D$ if all its components $f_1, f_2, \ldots, f_m$ are differentiable at $a$ as scalar functions. Moreover, if $h = (h_1, h_2, \ldots, h_n)$ is a vector in $\mathbb{R}^n$ and if
\[ df_i(a)(h) = a_{i1} h_1 + a_{i2} h_2 + \cdots + a_{in} h_n, \]
where
\[ a_{i1} = \frac{\partial f_i}{\partial x_1}(a), \quad a_{i2} = \frac{\partial f_i}{\partial x_2}(a), \quad \ldots, \quad a_{in} = \frac{\partial f_i}{\partial x_n}(a), \]
then the matrix
\[ J_{a,f} = (a_{ij}) = \left( \frac{\partial f_i}{\partial x_j}(a) \right) \]
with $m$ rows and $n$ columns is called the Jacobi (or jacobian) matrix of $f$ at $a$. The linear mapping $T : \mathbb{R}^n \to \mathbb{R}^m$ defined by the jacobian matrix $J_{a,f}$ (with respect to the canonical bases of $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively) is called the differential of $f$ at $a$. We write $T = df(a)$. The determinant $|J_{a,f}|$ of $J_{a,f}$, in the particular case $n = m$, is said to be the jacobian of $f$ at $a$.
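For instance,
\[ f : D \to \mathbb{R}^2, \quad D = \{(x, y, z) \in \mathbb{R}^3 : x > 0, y > 0, z > 0\}, \]
defined by
\[ f(x, y, z) = \left( \frac{1}{xyz},\; xyz \right), \]
is differentiable at any point $a = (a, b, c)$ of $D$ because its components
\[ f_1(x, y, z) = \frac{1}{xyz} \]
and
\[ f_2(x, y, z) = xyz \]
have this last property (why?). Since
\[ df_1(a) = -\frac{1}{a^2 bc}\, dx - \frac{1}{ab^2 c}\, dy - \frac{1}{abc^2}\, dz \]
and
\[ df_2(a) = bc \cdot dx + ac \cdot dy + ab \cdot dz, \]
the jacobian matrix of $f$ at $a$ is the $2 \times 3$ matrix
\[ \begin{pmatrix} -\dfrac{1}{a^2 bc} & -\dfrac{1}{ab^2 c} & -\dfrac{1}{abc^2} \\ bc & ac & ab \end{pmatrix}. \]
For instance, if $a = 1$, $b = 1$ and $c = -2$ (this point lies outside the set $D$ chosen above, but the formulas make sense wherever $xyz \neq 0$), we get the numerical matrix
\[ \begin{pmatrix} \frac{1}{2} & \frac{1}{2} & -\frac{1}{4} \\ -2 & -2 & 1 \end{pmatrix}. \]
Now, if we want to compute the value of $df(1, 1, -2) : \mathbb{R}^3 \to \mathbb{R}^2$ at the point $(3, 4, -5)$, from Linear Algebra or from Remark 26, we get
\[ \begin{pmatrix} \frac{1}{2} & \frac{1}{2} & -\frac{1}{4} \\ -2 & -2 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ 4 \\ -5 \end{pmatrix} = \begin{pmatrix} \frac{3}{2} + 2 + \frac{5}{4} \\ -6 - 8 - 5 \end{pmatrix} = \begin{pmatrix} \frac{19}{4} \\ -19 \end{pmatrix}, \]
so $df(1, 1, -2)(3, 4, -5) = \left( \frac{19}{4}, -19 \right)$.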
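As a rough numerical illustration of the Jacobi matrix, the Python sketch below approximates the matrix of $f(x, y, z) = (1/(xyz), xyz)$ at $(1, 1, -2)$ by central differences and applies it to the vector $(3, 4, -5)$. The helper names `jacobian_fd` and `apply_matrix` are assumptions made for this sketch only.

```python
def f(x, y, z):
    # Vector function of the example: f = (1/(xyz), xyz).
    p = x * y * z
    return (1.0 / p, p)

def jacobian_fd(func, point, step=1e-6):
    # Numerical Jacobi matrix: row i = component f_i, column j = variable x_j.
    rows = len(func(*point))
    cols = len(point)
    J = [[0.0] * cols for _ in range(rows)]
    for j in range(cols):
        plus = list(point)
        minus = list(point)
        plus[j] += step
        minus[j] -= step
        fp = func(*plus)
        fm = func(*minus)
        for i in range(rows):
            J[i][j] = (fp[i] - fm[i]) / (2 * step)
    return J

def apply_matrix(J, h):
    # Value of the linear mapping with matrix J at the vector h.
    return [sum(row[j] * h[j] for j in range(len(h))) for row in J]

a = (1.0, 1.0, -2.0)
J = jacobian_fd(f, a)                       # close to [[1/2, 1/2, -1/4], [-2, -2, 1]]
image = apply_matrix(J, (3.0, 4.0, -5.0))   # close to (19/4, -19)
```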


REMARK 28. One can prove that $f : D \to \mathbb{R}^m$ is differentiable at a point $a \in D \subset \mathbb{R}^n$ if and only if there is a linear mapping $T : \mathbb{R}^n \to \mathbb{R}^m$ which depends on $a$ such that the following limit exists and is equal to zero:
\[ \lim_{h \to 0} \frac{\|f(a + h) - f(a) - T(h)\|}{\|h\|} = 0. \tag{1.20} \]
We recall that
\[ \|f(a + h) - f(a) - T(h)\| = \sqrt{\sum_{i=1}^{m} \left[ f_i(a + h) - f_i(a) - T_i(h) \right]^2} \]
and everything reduces to the scalar component functions, for which we know this result.
The above statement is equivalent to saying that the increment
\[ f(a + h) - f(a) \]
of our vector function $f$ at $a$, corresponding to the increment $h$ of $a$, can be "well" approximated by the value of the linear function $T$ at $h$ (do this slowly, step by step!). The uniqueness of the above $T$ is obvious because its components are uniquely defined, being the differentials of some scalar functions, the components of $f$.


EXERCISE 1. Let $f, g : D \to \mathbb{R}^m$ be two differentiable functions on $D$ (at any point of $D$), where $D$ is an open subset in $\mathbb{R}^n$, and let $\lambda$ be a real number. Then $f + g$, $f - g$, $fg$ (only for $m = 1$), $\frac{f}{g}$ (only for $m = 1$ and $g(a) \neq 0$), $\lambda f$ are also differentiable on $D$ and

a) $d(f + g)(a) = df(a) + dg(a)$;

b) $d(f - g)(a) = df(a) - dg(a)$;

c) $d(fg)(a) = g(a) \cdot df(a) + f(a) \cdot dg(a)$;

d) $d\left( \dfrac{f}{g} \right)(a) = \dfrac{g(a)\, df(a) - f(a)\, dg(a)}{g(a)^2}$;

e) $d(\lambda f) = \lambda \cdot df$ for $\lambda \in \mathbb{R}$.

In c) and d) $f$, $g$ are only scalar functions!


2. Chain rules 


Let $A$, $B$ be two open subsets of $\mathbb{R}$ and let $a$ be a point in $A$. Let $f : A \to B$ be a function defined on $A$ with values in $B$ such that $f$ is differentiable at $a$. Let $g : B \to \mathbb{R}$ be a differentiable function at $f(a)$. Then the composed function $g \circ f : A \to \mathbb{R}$ is differentiable at $a$ and
\[ (g \circ f)'(a) = g'(f(a)) \cdot f'(a) \]
(the simplest chain rule!). Indeed,
\[ \lim_{x \to a} \frac{g(f(x)) - g(f(a))}{x - a} = \lim_{x \to a} \frac{g(f(x)) - g(f(a))}{f(x) - f(a)} \cdot \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = g'(f(a)) \cdot f'(a). \]
So $(g \circ f)'(a)$ exists and is exactly $g'(f(a)) \cdot f'(a)$. In particular, if $f$ is invertible and $f^{-1}$ is differentiable at $b = f(a)$ then, from $f^{-1}(f(x)) = x$, we get $(f^{-1})'(b) \cdot f'(a) = 1$, i.e. $(f^{-1})'(b) = \frac{1}{f'(a)}$, or $(f^{-1})'(f(a)) = \frac{1}{f'(a)}$.

We now want to generalize this simple chain rule to vector functions. Let us start with a simpler case, namely, let us take a "curve" $f : A \to B$, $f = (f_1, f_2, \ldots, f_n)$, where $A$ is an open subset in $\mathbb{R}$ and $B$ is an open subset in $\mathbb{R}^n$. Let $g : B \to \mathbb{R}$ be a differentiable function at $b = f(a)$ and let us assume that $f$ is differentiable at $a$. Let $h = g \circ f : A \to \mathbb{R}$ be the composition of $g$ and $f$, i.e. the restriction of $g$ to the $n$-D "curve" $f$ (to the image of $f$ in the common language!). Then, the following result is fundamental in applications.

THEOREM 68. (differentiation along a curve) With the above notation and hypotheses,
\[ (g \circ f)'(a) = \frac{\partial g}{\partial x_1}(f(a)) \cdot f_1'(a) + \frac{\partial g}{\partial x_2}(f(a)) \cdot f_2'(a) + \cdots + \frac{\partial g}{\partial x_n}(f(a)) \cdot f_n'(a). \tag{2.1} \]
For $n = 1$ we find again the above formula $(g \circ f)'(a) = g'(f(a)) \cdot f'(a)$.


PROOF. For simplicity we take the particular case $n = 2$ and we assume that $f$ and $g$ are functions of class $C^1$ on $A$ and $B$ respectively. Whenever we write the limit of something or the derivative of a function, be sure that we implicitly prove that this limit or this derivative exists (prove this slowly in what follows!).

In this case, $h(x) = g(f_1(x), f_2(x))$ for any $x \in A$. So,
\[ \lim_{x \to a} \frac{h(x) - h(a)}{x - a} = \lim_{x \to a} \frac{g(f_1(x), f_2(x)) - g(f_1(a), f_2(x))}{x - a} + \lim_{x \to a} \frac{g(f_1(a), f_2(x)) - g(f_1(a), f_2(a))}{x - a}. \tag{2.2} \]
Let us consider the first limit in (2.2) and let us apply Lagrange's formula (see Corollary 5) for the mapping $t \mapsto g(f_1(t), f_2(x))$ on the interval $[a, x]$ (or $[x, a]$ if $x < a$). We get
\[ g(f_1(x), f_2(x)) - g(f_1(a), f_2(x)) = \frac{\partial g}{\partial x_1}(f_1(c), f_2(x)) \cdot f_1'(c) \cdot (x - a), \]
where $c$ is between $a$ and $x$. Here we used our chain formula for $n = 1$ (where? explain!). Coming back to the first limit in (2.2) and using the fact that $\frac{\partial g}{\partial x_1}$, $f_1'$ and $f_2$ are continuous, we get:
\[ \lim_{x \to a} \frac{g(f_1(x), f_2(x)) - g(f_1(a), f_2(x))}{x - a} = \lim_{x \to a} \frac{\partial g}{\partial x_1}(f_1(c), f_2(x)) \cdot f_1'(c) = \frac{\partial g}{\partial x_1}(f_1(a), f_2(a)) \cdot f_1'(a). \]
We now take the second limit in (2.2) and apply Lagrange's formula for the mapping $t \mapsto g(f_1(a), f_2(t))$ on the same interval $[a, x]$. We get
\[ g(f_1(a), f_2(x)) - g(f_1(a), f_2(a)) = \frac{\partial g}{\partial x_2}(f_1(a), f_2(s)) \cdot f_2'(s) \cdot (x - a), \]
where $s$ is a number between $a$ and $x$. Since $\frac{\partial g}{\partial x_2}$, $f_2$ and $f_2'$ are continuous (by our restrictive hypothesis in the present proof!), we obtain that
\[ \lim_{x \to a} \frac{g(f_1(a), f_2(x)) - g(f_1(a), f_2(a))}{x - a} = \lim_{x \to a} \frac{\partial g}{\partial x_2}(f_1(a), f_2(s)) \cdot f_2'(s) = \frac{\partial g}{\partial x_2}(f_1(a), f_2(a)) \cdot f_2'(a), \]
thus our formula (2.1) is completely proved for $n = 2$.

The statement of the theorem is true without the restrictions made here, but the proof is more sophisticated.
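Formula (2.1) can be tested numerically on a concrete pair of functions. In the sketch below the curve `curve(t) = (cos t, t^2)` and the function `g(x1, x2) = x1^2 x2 + sin x2` are hypothetical choices made only for illustration; the derivative of the composition is approximated by a central difference and compared with the right-hand side of (2.1).

```python
import math

def curve(t):
    # Hypothetical curve f = (f1, f2) in the plane.
    return (math.cos(t), t * t)

def curve_prime(t):
    # Derivatives (f1'(t), f2'(t)) of the components of the curve.
    return (-math.sin(t), 2 * t)

def g(x1, x2):
    # Hypothetical scalar function of two variables.
    return x1 * x1 * x2 + math.sin(x2)

def grad_g(x1, x2):
    # Exact partial derivatives of g with respect to x1 and x2.
    return (2 * x1 * x2, x1 * x1 + math.cos(x2))

def chain_rule(a):
    # Right-hand side of (2.1): sum of dg/dx_i(f(a)) * f_i'(a).
    b = curve(a)
    return sum(p * d for p, d in zip(grad_g(*b), curve_prime(a)))

def direct(a, step=1e-6):
    # Left-hand side of (2.1), approximated by a central difference.
    return (g(*curve(a + step)) - g(*curve(a - step))) / (2 * step)

a = 0.7
lhs = direct(a)
rhs = chain_rule(a)
```

The two values agree up to the discretization error of the difference quotient, which is what (2.1) predicts.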

If the curve $f : \mathbb{R} \to \mathbb{R}^3$ is a line which passes through the point $M_0(x_0, y_0, z_0)$ and has the direction of the versor (unit vector)
\[ u = (\cos \alpha, \cos \beta, \cos \gamma) \]
(these cosines are usually called the directional cosines of the line), i.e. $f(t) = (x_0 + t \cos \alpha, y_0 + t \cos \beta, z_0 + t \cos \gamma)$, then the above derivative
\[ (g \circ f)'(0) = \frac{\partial g}{\partial x_1}(x_0, y_0, z_0) \cos \alpha + \frac{\partial g}{\partial x_2}(x_0, y_0, z_0) \cos \beta + \frac{\partial g}{\partial x_3}(x_0, y_0, z_0) \cos \gamma = \langle \operatorname{grad} g(M_0), u \rangle \]
(a scalar product!) is called the directional derivative of $g$ at the point $M_0$ along the versor $u$.
For instance, if $u = (1, 0, 0)$, we get the partial derivative of $g$ at $M_0$ with respect to $x_1$, etc.
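A small Python sketch may help to see the directional derivative as a scalar product. The function `g(x, y, z) = x^2 y + z^3`, the point `M0 = (1, 2, -1)` and the versor `u = (2/3, -1/3, 2/3)` are hypothetical data chosen here for illustration only.

```python
def g(x, y, z):
    # Hypothetical scalar function of three variables.
    return x * x * y + z ** 3

def grad_g(x, y, z):
    # Exact gradient of g.
    return (2 * x * y, x * x, 3 * z * z)

def directional_derivative(point, u):
    # Scalar product <grad g(M0), u>.
    return sum(p * c for p, c in zip(grad_g(*point), u))

def along_line(point, u, step=1e-6):
    # Derivative at t = 0 of t -> g(M0 + t*u), by a central difference.
    forward = [p + step * c for p, c in zip(point, u)]
    backward = [p - step * c for p, c in zip(point, u)]
    return (g(*forward) - g(*backward)) / (2 * step)

M0 = (1.0, 2.0, -1.0)
u = (2 / 3, -1 / 3, 2 / 3)                   # unit vector: 4/9 + 1/9 + 4/9 = 1
by_formula = directional_derivative(M0, u)   # grad g(M0) = (4, 1, 3), so the value is 13/3
by_limit = along_line(M0, u)
```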

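We can now immediately extend formula (2.1) to the case of a vector function $g : B \to \mathbb{R}^m$, $g = (g_1, g_2, \ldots, g_m)$. Thus, for any fixed $j \in \{1, 2, \ldots, m\}$, one has
\[ (g_j \circ f)'(a) = \frac{\partial g_j}{\partial x_1}(f(a)) \cdot f_1'(a) + \frac{\partial g_j}{\partial x_2}(f(a)) \cdot f_2'(a) + \cdots + \frac{\partial g_j}{\partial x_n}(f(a)) \cdot f_n'(a). \tag{2.3} \]
If we now use the matrix language, formula (2.3) becomes
\[ \begin{pmatrix} (g_1 \circ f)'(a) \\ (g_2 \circ f)'(a) \\ \vdots \\ (g_m \circ f)'(a) \end{pmatrix} = \begin{pmatrix} \frac{\partial g_1}{\partial x_1}(f(a)) & \frac{\partial g_1}{\partial x_2}(f(a)) & \cdots & \frac{\partial g_1}{\partial x_n}(f(a)) \\ \frac{\partial g_2}{\partial x_1}(f(a)) & \frac{\partial g_2}{\partial x_2}(f(a)) & \cdots & \frac{\partial g_2}{\partial x_n}(f(a)) \\ \vdots & \vdots & & \vdots \\ \frac{\partial g_m}{\partial x_1}(f(a)) & \frac{\partial g_m}{\partial x_2}(f(a)) & \cdots & \frac{\partial g_m}{\partial x_n}(f(a)) \end{pmatrix} \begin{pmatrix} f_1'(a) \\ f_2'(a) \\ \vdots \\ f_n'(a) \end{pmatrix}. \tag{2.4} \]
Up to now our function $f$ was a function of one variable $t$. Let us make the last generalization and consider a vector function $f$ of $p$ variables $t_1, t_2, \ldots, t_p$, defined on an open subset $A$ of $\mathbb{R}^p$. So we have the following composition: $A \xrightarrow{f} B \xrightarrow{g} \mathbb{R}^m$. We denote $h = g \circ f : A \to \mathbb{R}^m$ and preserve the notation $x = (x_1, x_2, \ldots, x_n)$ for a point (vector!) in $\mathbb{R}^n$. Thus,
\[ f(t_1, t_2, \ldots, t_p) = (f_1(t_1, t_2, \ldots, t_p), f_2(t_1, t_2, \ldots, t_p), \ldots, f_n(t_1, t_2, \ldots, t_p)) \]
and
\[ g(x_1, x_2, \ldots, x_n) = (g_1(x_1, x_2, \ldots, x_n), \ldots, g_m(x_1, x_2, \ldots, x_n)). \]
Let now $a$ be a fixed point of $A$, $a = (a_1, a_2, \ldots, a_p)$, and $b = f(a)$. We assume that $f$ and $g$ are differentiable at $a$ and at $b$ respectively.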


THEOREM 69. (chain rule theorem) With this notation and these hypotheses, the composed function $h = g \circ f$ is differentiable at $a$ and one has the following relation between the corresponding jacobian matrices:
\[ J_{a, g \circ f} = J_{b, g} \cdot J_{a, f}. \tag{2.5} \]
This is the most sophisticated chain rule. Moreover, in this case, Linear Algebra says that
\[ d(g \circ f)(a) = dg(b) \circ df(a), \tag{2.6} \]
this last composition being the composition of the corresponding linear mappings.
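The matrix relation (2.5) can be illustrated numerically. In the following sketch the functions `f : R^2 -> R^3` and `g : R^3 -> R^2` are hypothetical examples, and all three Jacobi matrices are approximated by central differences, so the comparison holds only up to a small numerical error.

```python
import math

def f(t1, t2):
    # Hypothetical inner function f : R^2 -> R^3.
    return (t1 * t2, math.sin(t1), t2 * t2)

def g(x1, x2, x3):
    # Hypothetical outer function g : R^3 -> R^2.
    return (x1 * x2 + x3, x1 - x2 * x3)

def h(t1, t2):
    # Composed function h = g o f : R^2 -> R^2.
    return g(*f(t1, t2))

def jacobian_fd(func, point, step=1e-6):
    # Central-difference Jacobi matrix: rows = components, columns = variables.
    rows = len(func(*point))
    cols = len(point)
    J = [[0.0] * cols for _ in range(rows)]
    for j in range(cols):
        plus = list(point)
        minus = list(point)
        plus[j] += step
        minus[j] -= step
        fp = func(*plus)
        fm = func(*minus)
        for i in range(rows):
            J[i][j] = (fp[i] - fm[i]) / (2 * step)
    return J

def matmul(P, Q):
    # Ordinary matrix product of two nested lists.
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

a = (1.0, 2.0)
b = f(*a)
lhs = jacobian_fd(h, a)                              # J_{a, g o f}, a 2 x 2 matrix
rhs = matmul(jacobian_fd(g, b), jacobian_fd(f, a))   # J_{b, g} (2 x 3) times J_{a, f} (3 x 2)
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

Note the order of the factors: the matrix of the outer function stands on the left, exactly as the outer linear mapping is applied last in (2.6).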


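PROOF. Formula (2.6) is a direct consequence of formula (2.5) and of the basic result of Linear Algebra which says that there is an isomorphism (a linear bijection) between the $m \times n$ matrices and the linear mappings $T : \mathbb{R}^n \to \mathbb{R}^m$. This bijection carries the product of two matrices into the composition of the corresponding linear mappings. Hence, it remains to prove formula (2.5). We shall see that this formula is a pure generalization of formula (2.4). Indeed, let us fix $i \in \{1, 2, \ldots, p\}$ and let us consider the mapping
\[ \varphi^{(i)} : A_i \to B, \quad \varphi^{(i)} = (\varphi^{(i)}_1, \varphi^{(i)}_2, \ldots, \varphi^{(i)}_n), \]
defined by
\[ t \mapsto f(a_1, a_2, \ldots, a_{i-1}, t, a_{i+1}, \ldots, a_p). \]
It is defined on the $i$-th projection $A_i = pr_i(A)$ of $A$ (which is again open; why?). Let us denote $h^{(i)} = g \circ \varphi^{(i)}$ and let us write formula (2.4) for it:
\[ \begin{pmatrix} [h^{(i)}_1]'(a_i) \\ [h^{(i)}_2]'(a_i) \\ \vdots \\ [h^{(i)}_m]'(a_i) \end{pmatrix} = \left( \frac{\partial g_j}{\partial x_k}(\varphi^{(i)}(a_i)) \right)_{j = 1, \ldots, m;\; k = 1, \ldots, n} \begin{pmatrix} [\varphi^{(i)}_1]'(a_i) \\ [\varphi^{(i)}_2]'(a_i) \\ \vdots \\ [\varphi^{(i)}_n]'(a_i) \end{pmatrix}. \tag{2.7} \]
We now see that
\[ [h^{(i)}_j]'(a_i) = (g_j \circ \varphi^{(i)})'(a_i) = \frac{\partial h_j}{\partial t_i}(a) \]
for any $j = 1, 2, \ldots, m$ and $i = 1, 2, \ldots, p$. Here $h_1, h_2, \ldots, h_m$ are the components of the composed function $h = g \circ f$. Another remark is that
\[ \frac{\partial g_j}{\partial x_k}(\varphi^{(i)}(a_i)) = \frac{\partial g_j}{\partial x_k}(f(a)) \]
and $[\varphi^{(i)}_k]'(a_i) = \frac{\partial f_k}{\partial t_i}(a)$. But, if we substitute all of these in formula (2.7), we get exactly formula (2.5) from the statement of the theorem.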


REMARK 29. It is possible to prove the chain rule theorem, namely formula (2.6), directly and in a rather short way. But that proof (see [Nik], or [Pal]) is more abstract, more elaborate and not so natural. Our proof here is not so general, but it follows the natural historical way, from a "simpler" to a "more complicated" case.


Let us take a usual situation and let us apply formula (2.5) to it. Let $A$ and $B$ be two open subsets of $\mathbb{R}^2$ and let $(x, y) \mapsto (u(x, y), v(x, y))$ be a differentiable (at any point of $A$) vector function defined on $A$ with values in $B$. Let $f(u, v)$ be a differentiable function defined on $B$ with values in $\mathbb{R}$. Here we also use $u$ and $v$ for the coordinates of a free vector in $B \subset \mathbb{R}^2$. The only connection between $u$, $v$ and the functions of two variables $u(x, y)$ and $v(x, y)$ respectively is that the variables $u$ and $v$ are substituted with the two functions $u(x, y)$ and $v(x, y)$ respectively, in the variables $x$ and $y$. For instance, $u = x + y$, $v = xy$ and $f(x + y, xy)$. This is a new function in $x$ and $y$. Here, $u(x, y) = x + y$ and $v(x, y) = xy$. This abuse of notation has been in use for more than 200 years and it has not caused any damage in science. Let $h(x, y) = f(u(x, y), v(x, y))$ be the composition of $f$ and the first function $(x, y) \mapsto (u(x, y), v(x, y))$. This new function is also denoted by $f$, i.e. the notation $f(x, y) = f(u(x, y), v(x, y))$ produces no confusion for a working mathematician (another abuse, which is not recommended for a beginner!). The function $h$ is also differentiable on $A$ and
\[ \begin{pmatrix} \frac{\partial h}{\partial x}(a, b) & \frac{\partial h}{\partial y}(a, b) \end{pmatrix} = \begin{pmatrix} \frac{\partial f}{\partial u}(u(a, b), v(a, b)) & \frac{\partial f}{\partial v}(u(a, b), v(a, b)) \end{pmatrix} \begin{pmatrix} \frac{\partial u}{\partial x}(a, b) & \frac{\partial u}{\partial y}(a, b) \\ \frac{\partial v}{\partial x}(a, b) & \frac{\partial v}{\partial y}(a, b) \end{pmatrix}. \]
Let us write this formula in the usual way:
\[ \begin{aligned} \frac{\partial h}{\partial x}(a, b) &= \frac{\partial f}{\partial u}(u(a, b), v(a, b)) \frac{\partial u}{\partial x}(a, b) + \frac{\partial f}{\partial v}(u(a, b), v(a, b)) \frac{\partial v}{\partial x}(a, b), \\ \frac{\partial h}{\partial y}(a, b) &= \frac{\partial f}{\partial u}(u(a, b), v(a, b)) \frac{\partial u}{\partial y}(a, b) + \frac{\partial f}{\partial v}(u(a, b), v(a, b)) \frac{\partial v}{\partial y}(a, b). \end{aligned} \tag{2.8} \]
How do we recall these useful formulas? For this, write again $h(x, y) = f(u(x, y), v(x, y))$. To find $\frac{\partial h}{\partial x}$ we look at the variables $u$ and $v$ of $f$ and observe where $x$ is. If $x$ appears in $u = u(x, y)$, we take the partial derivative of $f$ w.r.t. $u$ and multiply it by the partial derivative of $u$ w.r.t. $x$. Here is a "chain": $f \to u \to x$. So we get $\frac{\partial f}{\partial u} \cdot \frac{\partial u}{\partial x}$. If $x$ also appears in $v = v(x, y)$, we consider the chain $f \to v \to x$ and obtain $\frac{\partial f}{\partial v} \cdot \frac{\partial v}{\partial x}$. Since $x$ appears both (if it is the case!) in $u$ and in $v$, we must superpose both "effects" (add them!) and finally obtain:
\[ \frac{\partial h}{\partial x} = \frac{\partial f}{\partial u} \frac{\partial u}{\partial x} + \frac{\partial f}{\partial v} \frac{\partial v}{\partial x}. \tag{2.9} \]
The corresponding points at which we compute these partial derivatives are easy to find. If we change $x$ with $y$ in (2.9) we get the second essential formula of (2.8):
\[ \frac{\partial h}{\partial y} = \frac{\partial f}{\partial u} \frac{\partial u}{\partial y} + \frac{\partial f}{\partial v} \frac{\partial v}{\partial y}. \tag{2.10} \]
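The two formulas (2.9) and (2.10) can be tried on the substitution $u = x + y$, $v = xy$ mentioned above. In the sketch below the outer function `f(u, v) = u^2 v + v^3` is a hypothetical choice; the chain-rule values are compared with central-difference approximations of the partial derivatives of the composed function `h`.

```python
def u(x, y):
    return x + y

def v(x, y):
    return x * y

def f(uu, vv):
    # Hypothetical outer function of the variables u and v.
    return uu * uu * vv + vv ** 3

def f_u(uu, vv):
    # Exact partial derivative of f with respect to u.
    return 2 * uu * vv

def f_v(uu, vv):
    # Exact partial derivative of f with respect to v.
    return uu * uu + 3 * vv * vv

def h(x, y):
    # Composed function h(x, y) = f(u(x, y), v(x, y)).
    return f(u(x, y), v(x, y))

def chain_partials(x, y):
    # (2.9) and (2.10) with du/dx = 1, du/dy = 1, dv/dx = y, dv/dy = x.
    fu = f_u(u(x, y), v(x, y))
    fv = f_v(u(x, y), v(x, y))
    return (fu * 1 + fv * y, fu * 1 + fv * x)

def fd_partials(x, y, step=1e-6):
    # Central-difference approximations of dh/dx and dh/dy.
    hx = (h(x + step, y) - h(x - step, y)) / (2 * step)
    hy = (h(x, y + step) - h(x, y - step)) / (2 * step)
    return (hx, hy)

a, b = 1.5, -0.5
exact = chain_partials(a, b)
approx = fd_partials(a, b)
```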


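EXAMPLE 14. In the Cartesian plane $\{O; \mathbf{i}, \mathbf{j}\}$, we consider a heating source in the origin $O(0, 0)$. The temperature $f(x, y)$ at the point $M(x, y)$ verifies the following equation (a partial differential equation of order 1, a PDE-1):
\[ y \frac{\partial f}{\partial x} - x \frac{\partial f}{\partial y} = 0. \]
It says that at any point $M(x, y)$ the "gradient" vector
\[ \operatorname{grad} f = \left( \frac{\partial f}{\partial x}(x, y), \frac{\partial f}{\partial y}(x, y) \right) \]
of the temperature is perpendicular to the normal vector of the position vector $\overrightarrow{OM} = x\mathbf{i} + y\mathbf{j}$, at the point $M(x, y)$. Hence, $\operatorname{grad} f$ is colinear to $\overrightarrow{OM}$. Let us change the variables $x$ and $y$ with $u = x$ and $v = x^2 + y^2$. The new function $h(u, v)$ is connected to $f$ by the rule:
\[ f(x, y) = h(x, x^2 + y^2). \]
So,
\[ \frac{\partial f}{\partial x} = \frac{\partial h}{\partial u} \frac{\partial u}{\partial x} + \frac{\partial h}{\partial v} \frac{\partial v}{\partial x} = \frac{\partial h}{\partial u} + 2x \frac{\partial h}{\partial v} \]
and
\[ \frac{\partial f}{\partial y} = \frac{\partial h}{\partial u} \frac{\partial u}{\partial y} + \frac{\partial h}{\partial v} \frac{\partial v}{\partial y} = 2y \frac{\partial h}{\partial v}. \]
Hence,
\[ 0 = y \frac{\partial f}{\partial x} - x \frac{\partial f}{\partial y} = y \frac{\partial h}{\partial u} + 2xy \frac{\partial h}{\partial v} - 2xy \frac{\partial h}{\partial v} = y \frac{\partial h}{\partial u}. \]
Hence, whenever $y \neq 0$, $\frac{\partial h}{\partial u} = 0$ is the equation in the new function $h$. So $h$ is a function of $v = x^2 + y^2$ only, the square of the distance to the origin. Thus, the temperature is constant at all the points which are on the same circle of radius $r > 0$. We say that the level curves ($f(x, y) = $ constant) of the temperature are all the concentric circles with center at $O$.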
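The conclusion of this example can be checked on a sample temperature. The condition that $\operatorname{grad} f$ is colinear to $\overrightarrow{OM}$ reads $y \frac{\partial f}{\partial x} - x \frac{\partial f}{\partial y} = 0$; the sketch below evaluates this expression for the hypothetical radial function `temperature(x, y) = exp(-(x^2 + y^2))`, with the partial derivatives approximated by central differences.

```python
import math

def temperature(x, y):
    # Hypothetical radial temperature: depends only on v = x^2 + y^2.
    return math.exp(-(x * x + y * y))

def residual(func, x, y, step=1e-6):
    # Value of y * df/dx - x * df/dy, with central-difference partials.
    fx = (func(x + step, y) - func(x - step, y)) / (2 * step)
    fy = (func(x, y + step) - func(x, y - step)) / (2 * step)
    return y * fx - x * fy

points = [(1.0, 2.0), (-0.5, 0.3), (0.7, -1.1)]
residuals = [residual(temperature, x, y) for x, y in points]  # all close to 0
```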


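We must apply the "spirit" of formulas (2.5) or (2.10), not the formulas themselves. For instance, let
\[ f(x, y, z) = (\sin(x^2 + y^2), \cos(2z^2), x^2 + y^2 + z^2). \]
Then,
\[ \frac{\partial f}{\partial x} = (2x \cos(x^2 + y^2), 0, 2x), \quad \frac{\partial f}{\partial y} = (2y \cos(x^2 + y^2), 0, 2y) \]
and
\[ \frac{\partial f}{\partial z} = (0, -4z \sin(2z^2), 2z). \]
If we want to compute $\frac{\partial f}{\partial x}(1, -1, 7)$ we simply put $x = 1$, $y = -1$ and $z = 7$ in the expression of $\frac{\partial f}{\partial x}$. So,
\[ \frac{\partial f}{\partial x}(1, -1, 7) = (2 \cos 2, 0, 2). \]
Here $\cos 2$ means the cosine of two radians.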
EXAMPLE 15. Let $M(x(t), y(t), z(t))$, where $t$ is time, $t \in (a, b)$, $a > 0$, be a moving point of mass $m = 5$ kg on the curve
\[ \Gamma : x = x(t),\; y = y(t),\; z = z(t). \]
Let
\[ \mathbf{v}(t) = (x'(t), y'(t), z'(t)) \]
and
\[ \mathbf{w}(t) = (x''(t), y''(t), z''(t)) \]
be the velocity and the acceleration respectively. We assume that the kinetic energy
\[ T(t) = \frac{m \|\mathbf{v}(t)\|^2}{2} = \frac{5}{2} \left\{ [x'(t)]^2 + [y'(t)]^2 + [z'(t)]^2 \right\} \]
does not depend on time, i.e. $T'(t) = 0$. Let us use the chain rule to make the computation in this last equality:
\[ T'(t) = 5 \left\{ x'(t) x''(t) + y'(t) y''(t) + z'(t) z''(t) \right\} = 0, \]
i.e. the scalar (inner) product of $\mathbf{v}$ and $\mathbf{w}$ is equal to zero. In this case, the acceleration is perpendicular to the velocity. This restriction is very useful in physical considerations.


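DEFINITION 28. A subset $K$ of $\mathbb{R}^n$ is said to be a conic subset if for any $x$ in $K$ and any $t \in \mathbb{R}$, one has that $tx \in K$ (see Fig. 7.1).

[Fig. 7.1: a conic subset $K$; $K$ is the whole $\mathbb{R}$ if $n = 1$.]

For instance,
\[ K = \mathbb{R}^n, \quad K = \{(x, y) \in \mathbb{R}^2 : y = mx, \text{ where } m \text{ is a fixed parameter (real number)}\}, \]
\[ K = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 = z^2\} \]
are conic subsets (prove it!).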
DEFINITION 29. Let $f : K \to \mathbb{R}$ be a function defined on a conic subset $K \subset \mathbb{R}^n$ with values in $\mathbb{R}$ and let $\alpha$ be a fixed real number. We say that $f$ is homogeneous of degree $\alpha$ if
\[ f(tx_1, tx_2, \ldots, tx_n) = t^{\alpha} f(x_1, x_2, \ldots, x_n) \tag{2.11} \]
for any $x = (x_1, x_2, \ldots, x_n)$ in $K$ and for any $t$ in $\mathbb{R}_+$.

For instance, the distance to origin function
\[ d(x, y, z) = \sqrt{x^2 + y^2 + z^2} \]
is a homogeneous function of degree 1. Indeed, for $t > 0$,
\[ d(tx, ty, tz) = \sqrt{(tx)^2 + (ty)^2 + (tz)^2} = t \sqrt{x^2 + y^2 + z^2} = t\, d(x, y, z). \]
L. Euler introduced these functions when he studied the mechanics of a moving point in the plane. For $\alpha = 0$, we simply call these functions homogeneous. Euler discovered a very useful property of homogeneous functions. In the following we consider a generalization of Euler's result.


THEOREM 70. (Euler formula for homogeneous functions) Let $K$ be a conic open subset in $\mathbb{R}^n$ and let $f$ be a function of class $C^1$ on $K$, which is homogeneous of degree $\alpha$. Then,
\[ x_1 \frac{\partial f}{\partial x_1}(x) + x_2 \frac{\partial f}{\partial x_2}(x) + \cdots + x_n \frac{\partial f}{\partial x_n}(x) = \alpha f(x). \tag{2.12} \]

PROOF. By the definition of a homogeneous function (Definition 29), we may look at formula (2.11) and differentiate both sides w.r.t. $t$ (here we use the chain rule... explain this slowly...):
\[ x_1 \frac{\partial f}{\partial x_1}(tx) + x_2 \frac{\partial f}{\partial x_2}(tx) + \cdots + x_n \frac{\partial f}{\partial x_n}(tx) = \alpha t^{\alpha - 1} \cdot f(x). \]
We now make $t = 1$ in this last formula and obtain Euler's formula (2.12).


If $\alpha = 0$, i.e. if our function is homogeneous, Euler's formula can be written as
\[ \langle x, \operatorname{grad} f(x) \rangle = 0. \tag{2.13} \]
Here $\langle \cdot, \cdot \rangle$ is the (inner) scalar product in $\mathbb{R}^n$. This last formula (2.13) says that at any point $x$ of the trajectory of a moving point in $\mathbb{R}^n$, the gradient (a generalization of the velocity for $n$ variables!) of $f$ is perpendicular to the position vector $x$. For instance, suppose we know that the temperature $T(x, y)$ at any point $(x, y)$ of the plane $\mathbb{R}^2$ is the same for all the points of an arbitrary line $y = mx$, where $m$ runs freely on $\mathbb{R}$. This means (in mathematical language) that $T(tx, ty) = T(x, y)$ for any $(x, y) \in \mathbb{R}^2$ and any $t$ in $\mathbb{R}_+$ (why?). So, the temperature is a homogeneous function and we can write Euler's formula for $\alpha = 0$, i.e. $\langle x, \operatorname{grad} T(x) \rangle = 0$, where $x = (x, y)$ and
\[ \operatorname{grad} T(x) = \left( \frac{\partial T}{\partial x}(x, y), \frac{\partial T}{\partial y}(x, y) \right). \]
Finally we get the following PDE of order 1:
\[ x \frac{\partial T}{\partial x}(x, y) + y \frac{\partial T}{\partial y}(x, y) = 0, \]
i.e. at any point the gradient of the temperature is perpendicular to the position vector $(x, y)$.

In exercises, one is usually asked to verify Euler's formula for a given homogeneous function $f$. For instance, let us verify Euler's formula for $f(x, y, z) = xyz + 3x^3 + y^3$. We do not know yet if the function $f$ is homogeneous and, if it is so, we also do not know its homogeneity degree. Let us put $tx$, $ty$ and $tz$ instead of $x$, $y$ and $z$ respectively:
\[ f(tx, ty, tz) = t^3 (xyz + 3x^3 + y^3) = t^3 f(x, y, z). \]
Thus, our function is homogeneous of degree 3. So we have to verify the following formula:
\[ x \frac{\partial f}{\partial x} + y \frac{\partial f}{\partial y} + z \frac{\partial f}{\partial z} = 3f. \tag{2.14} \]
Indeed, $\frac{\partial f}{\partial x} = yz + 9x^2$, $\frac{\partial f}{\partial y} = xz + 3y^2$ and $\frac{\partial f}{\partial z} = xy$. Substituting in (2.14), we get:
\[ x(yz + 9x^2) + y(xz + 3y^2) + zxy = 3(xyz + 3x^3 + y^3) = 3f. \]
Hence, we just verified Euler's formula for our particular function.
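The same verification can be repeated numerically at a sample point. The Python sketch below uses the partial derivatives computed above; the sample point `(1.5, -2.0, 0.5)` and the scaling factor `t = 2` are arbitrary choices made for illustration.

```python
def f(x, y, z):
    # The function of the example: xyz + 3x^3 + y^3.
    return x * y * z + 3 * x ** 3 + y ** 3

def euler_sum(x, y, z):
    # Left-hand side of (2.14), with the exact partial derivatives.
    fx = y * z + 9 * x * x
    fy = x * z + 3 * y * y
    fz = x * y
    return x * fx + y * fy + z * fz

point = (1.5, -2.0, 0.5)
t = 2.0
scaled = f(*(t * c for c in point))   # f(tx, ty, tz)
value = f(*point)                     # f(x, y, z) = 0.625 at this point
lhs = euler_sum(*point)               # should equal 3 * f(x, y, z)
```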


3. Problems 


1. Compute the following partial derivatives: 


2 
0 oe 
fla,y) = VP SEY), (1) 


I. ; Of .« Of 7 4 
= 2 Zs 


Flo, y) = In(e +9? — 1); S40, 1) 


b) 


Of 


’ Bye 1), 


164 7. PARTIAL DERIVATIVES. DIFFERENTIABILITY. 


eke oP om 
F(a,y) = xexpley)s 5 (1,0), £4(1,0), 5(1,0) 


Of Of 0? f 


— ,lny Pane poe 
fey) =2™"@ > 0.y > 0), (ee). (ere) a (ee): 


Meg 2)\Sa" (0,9 > 0)erad fC, 1.1): 


Of arf arf 
alee y) —— arctan vY, OyOx? (1, 1), OxOy? (1, 1), 9a3 hs 1). 


H 
F(e.y) = aresin( =), (1,2), 


2. Prove that the following functions verify the indicated equations: 


a) 


(0, y) = ey (a? — v2), y= + oy = (0? + yz 
Ox Oy 
b) 
Loe. Toe. 2 
= r(x? — y”); | ——- 
2(x,y) = cO(a" —y ar sone 
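c)
\[ u(x, y) = \arctan \frac{y}{x}; \quad \Delta u \overset{\text{def}}{=} \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0; \]
d)
\[ u(x, t) = \Phi(x - at) + \Psi(x + at); \quad \frac{\partial^2 u}{\partial t^2} - a^2 \frac{\partial^2 u}{\partial x^2} = 0 \]
(the wave equation).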
e) 
y G3 072 Orz 3072 
= ! +2 | =0. 
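f)
\[ u(x, y, z) = \frac{1}{\sqrt{x^2 + y^2 + z^2}}; \quad \Delta u \overset{\text{def}}{=} \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0. \]
Hint: Let us denote $r = \sqrt{x^2 + y^2 + z^2}$. Then, $\frac{\partial u}{\partial x} = -\frac{1}{r^2} \cdot \frac{\partial r}{\partial x}$, etc.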




3. Show that the Euler’s formula is true for the following homoge- 
neous functions: 


a) f(x,y) = 

b) 
f(a,y, 2) = Vet Jyt v2; 
f(la,y,2z) = Jer ty? + 2; 


d) f(x,y, 2) = Zexp(2). 
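4. Prove that the following function
\[ f(x, y) = \begin{cases} \dfrac{xy}{\sqrt{x^2 + y^2}}, & \text{for } (x, y) \neq (0, 0), \\ 0, & \text{if } x = 0 \text{ and } y = 0 \end{cases} \]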


c) 


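is continuous, has partial derivatives, but it is not differentiable at $(0, 0)$ (Hint: $\left| \frac{xy}{\sqrt{x^2 + y^2}} \right| \le |y|$, so
\[ \lim_{x \to 0,\, y \to 0} \frac{xy}{\sqrt{x^2 + y^2}} = 0 = f(0, 0); \quad \frac{\partial f}{\partial x}(0, 0) = \frac{\partial f}{\partial y}(0, 0) = 0. \]
If it were differentiable at $(0, 0)$, one would have that
\[ f(h_1, h_2) - f(0, 0) = \frac{\partial f}{\partial x}(0, 0) h_1 + \frac{\partial f}{\partial y}(0, 0) h_2 + \omega(h_1, h_2) \sqrt{h_1^2 + h_2^2}, \tag{3.1} \]
where $\omega(0, 0) = 0$, $\omega$ is continuous at $(0, 0)$ and
\[ \lim_{h_1 \to 0,\, h_2 \to 0} \omega(h_1, h_2) = 0. \]
But, from (3.1), one has that $\omega(x, y) = \frac{xy}{x^2 + y^2}$ and so one would have that
\[ \lim_{x \to 0,\, y \to 0} \frac{xy}{x^2 + y^2} = 0. \]
However, this last limit does not exist at all!).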

CHAPTER 8 


Taylor’s formula for several variables. 


1. Higher partial derivatives. Differentials of order k. 


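Let $\frac{\partial f}{\partial x}$ be the partial derivative with respect to $x$ of a function $f : A \to \mathbb{R}$, where $A$ is an open subset in $\mathbb{R}^2$. Then $(x, y) \mapsto \frac{\partial f}{\partial x}(x, y)$ is a new function of two variables $x$ and $y$. If this new function has a partial derivative $\frac{\partial}{\partial x}\left( \frac{\partial f}{\partial x} \right)(a, b)$ w.r.t. $x$, at a point $(a, b)$, we denote it by $\frac{\partial^2 f}{\partial x^2}(a, b)$ and say "d two f over d x two at $(a, b)$". If the same function $(x, y) \mapsto \frac{\partial f}{\partial x}(x, y)$ has a partial derivative $\frac{\partial}{\partial y}\left( \frac{\partial f}{\partial x} \right)(a, b)$ w.r.t. $y$, at a point $(a, b)$, we write it as $\frac{\partial^2 f}{\partial y \partial x}(a, b)$ and call it the mixed derivative of $f$ at $(a, b)$. What do we mean by $\frac{\partial^3 f}{\partial x \partial y^2}$? (Say "d three f over d x d y two"; pay attention to the fact that 3 from $\partial^3$ is equal to the sum of 1 and 2, from $\partial x$ and $\partial y^2$ respectively.) In general, let $f : A \to \mathbb{R}$, $f(x_1, x_2, \ldots, x_n)$, be a function of $n$ variables, defined on an open subset $A$ of $\mathbb{R}^n$, such that it is $k_n$-times differentiable with respect to $x_n$, i.e. $\frac{\partial^{k_n} f}{\partial x_n^{k_n}}$ exists on $A$. If this new function
\[ (x_1, x_2, \ldots, x_n) \mapsto \frac{\partial^{k_n} f}{\partial x_n^{k_n}}(x_1, x_2, \ldots, x_n) \]
is $k_{n-1}$-times differentiable with respect to $x_{n-1}$, the new obtained function
\[ \frac{\partial^{k_{n-1}}}{\partial x_{n-1}^{k_{n-1}}} \left( \frac{\partial^{k_n} f}{\partial x_n^{k_n}} \right)(x) \]
is denoted by
\[ \frac{\partial^{k_n + k_{n-1}} f}{\partial x_{n-1}^{k_{n-1}} \partial x_n^{k_n}}. \]
And so on. We finally obtain the function
\[ \frac{\partial^{k_n + k_{n-1} + \cdots + k_1} f}{\partial x_1^{k_1} \partial x_2^{k_2} \cdots \partial x_n^{k_n}}. \]
The order of the variables $x_1, x_2, \ldots, x_n$ in the denominator can be changed, but then we may obtain another new function.

For instance, if $f(x, y, z) = x^4 y^3 z^5$, then $\frac{\partial^5 f}{\partial y^2 \partial x^2 \partial z}$ can be successively computed. First of all we compute
\[ g_1 = \frac{\partial f}{\partial z} = 5 x^4 y^3 z^4. \]
Then we compute
\[ g_2 = \frac{\partial g_1}{\partial x} = \frac{\partial^2 f}{\partial x \partial z} = 20 x^3 y^3 z^4. \]
Now we compute
\[ g_3 = \frac{\partial g_2}{\partial x} = \frac{\partial^3 f}{\partial x^2 \partial z} = 60 x^2 y^3 z^4. \]
Then we consider
\[ g_4 = \frac{\partial g_3}{\partial y} = \frac{\partial^4 f}{\partial y \partial x^2 \partial z} = 180 x^2 y^2 z^4. \]
Finally,
\[ \frac{\partial g_4}{\partial y} = \frac{\partial^5 f}{\partial y^2 \partial x^2 \partial z} = 360 x^2 y z^4. \]
And this last one is our final result.

The function $\frac{\partial^{k_n + k_{n-1} + \cdots + k_1} f}{\partial x_1^{k_1} \cdots \partial x_n^{k_n}}$ is said to be the partial derivative of order $k = k_n + k_{n-1} + \cdots + k_1$ of $f$, $k_n$-times w.r.t. $x_n$, $k_{n-1}$-times w.r.t. $x_{n-1}$, ..., and $k_1$-times w.r.t. $x_1$. The mapping $f \mapsto \frac{\partial f}{\partial x_j}$ is also denoted by $f \mapsto D_{x_j} f$. This $D_{x_j}$ is the partial differential operator w.r.t. the variable $x_j$. So, $f \mapsto \frac{\partial^2 f}{\partial x_i \partial x_j}$ is the composition $D_{x_i} \circ D_{x_j}$ applied to $f$. In general, a mapping defined on a set of functions is no longer called a function, but an operator. We also put $D_{x_i x_j}$ instead of $D_{x_i} \circ D_{x_j}$. Such an operator is called a differential operator. In general, the operators $D_{x_i}$ and $D_{x_j}$ do not commute if $i \neq j$. This means that there are examples of functions $f$ and points $a$ for which $\frac{\partial^2 f}{\partial x_i \partial x_j}(a) \neq \frac{\partial^2 f}{\partial x_j \partial x_i}(a)$. Following [Pal], p. 145, we consider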


vytrs, if (x,y) # (0,0) 
(1.1) f(z,y) = 
é Ee] hy). 


It is not difficult to prove that a ! (0,0) = —1, but FL (0, 0) = 1 (do 


it step by step and explain everything!). Hence, in this case we cannot 
commute the order of derivation! 
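The phenomenon can be observed numerically on a standard function of this kind. The sketch below assumes the classical example $xy\,\frac{x^2 - y^2}{x^2 + y^2}$ (extended by 0 at the origin); this is an assumption made for illustration and may differ in sign or form from formula (1.1). Each mixed derivative at $(0, 0)$ is approximated by nested central differences, with an inner step much smaller than the outer one.

```python
def f(x, y):
    # Assumed classical example; the value at the origin is defined as 0.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)

def d_dx(func, x, y, step):
    # Central-difference partial derivative with respect to x.
    return (func(x + step, y) - func(x - step, y)) / (2 * step)

def d_dy(func, x, y, step):
    # Central-difference partial derivative with respect to y.
    return (func(x, y + step) - func(x, y - step)) / (2 * step)

inner = 1e-8   # step used for the first derivative
outer = 1e-3   # step used for the second derivative

# Differentiate first w.r.t. x, then w.r.t. y, at the origin.
x_then_y = (d_dx(f, 0.0, outer, inner) - d_dx(f, 0.0, -outer, inner)) / (2 * outer)
# Differentiate first w.r.t. y, then w.r.t. x, at the origin.
y_then_x = (d_dy(f, outer, 0.0, inner) - d_dy(f, -outer, 0.0, inner)) / (2 * outer)
```

For this assumed function the two orders give values close to $-1$ and $1$ respectively, so the order of derivation matters at the origin.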

Let $A$ be an open subset of $\mathbb{R}^n$ and let $f : A \to \mathbb{R}$ be a function of $n$ variables defined on $A$. We say that $f$ is of class $C^2$ on $A$ if all the partial derivatives of order two, $\frac{\partial^2 f}{\partial x_i \partial x_j}(a)$, exist and are continuous at any point $a$ of $A$. The following theorem gives us a sufficient condition under which the change of the order of derivation has no influence on the final result.
THEOREM 71. (Schwarz' Theorem) Let $f : A \to \mathbb{R}$ be a function of class $C^2$ on $A$. Then
\[ \frac{\partial^2 f}{\partial x_i \partial x_j}(a) = \frac{\partial^2 f}{\partial x_j \partial x_i}(a) \]
for any point $a$ of $A$ and for any pair $(i, j)$. This means that for such a function (of class $C^2$ on $A$) we can commute the order of derivation.


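PROOF. One can reduce everything to the two variables case (why?). Moreover, we can take an open ball (disc) $B(a, r)$, $r > 0$, $a = (a_1, a_2)$, included in $A$ and consider $f$ defined on this ball $B(a, r)$. Let $\{(x_n, y_n)\}$ be a sequence of points in $B(a, r)$ which converges to $a$. For a fixed natural number $n$ let us consider the segments $[a_1, x_n]$ and $[a_2, y_n]$ in $B(a, r)$. Let
\[ R(x_n, y_n) = f(x_n, y_n) - f(x_n, a_2) - f(a_1, y_n) + f(a_1, a_2) \tag{1.2} \]
and let $g(t) = f(t, y_n) - f(t, a_2)$, $t \in [a_1, x_n]$. Let us apply Lagrange's theorem (see Corollary 5) to the function $g$ on $[a_1, x_n]$:
\[ g(x_n) - g(a_1) = g'(c_n) \cdot (x_n - a_1), \]
where $c_n \in [a_1, x_n]$. But
\[ g(x_n) - g(a_1) = R(x_n, y_n) \]
and
\[ g'(c_n) = \frac{\partial f}{\partial x}(c_n, y_n) - \frac{\partial f}{\partial x}(c_n, a_2). \]
So,
\[ R(x_n, y_n) = \left[ \frac{\partial f}{\partial x}(c_n, y_n) - \frac{\partial f}{\partial x}(c_n, a_2) \right] (x_n - a_1). \]
Now we apply again Lagrange's theorem to the function
\[ u \mapsto \frac{\partial f}{\partial x}(c_n, u), \]
where $u \in [a_2, y_n]$. Hence,
\[ R(x_n, y_n) = \frac{\partial^2 f}{\partial y \partial x}(c_n, d_n) (x_n - a_1)(y_n - a_2), \tag{1.3} \]
where $d_n \in [a_2, y_n]$. Now we take a new function
\[ h(t) = f(x_n, t) - f(a_1, t), \]
$t \in [a_2, y_n]$, and observe that
\[ R(x_n, y_n) = h(y_n) - h(a_2). \]
Let us apply Lagrange's theorem to $h$ on $[a_2, y_n]$:
\[ R(x_n, y_n) = h'(e_n) \cdot (y_n - a_2), \tag{1.4} \]
where $e_n \in [a_2, y_n]$. But $h'(e_n) = \frac{\partial f}{\partial y}(x_n, e_n) - \frac{\partial f}{\partial y}(a_1, e_n)$ so, applying again Lagrange's theorem to the function
\[ v \mapsto \frac{\partial f}{\partial y}(v, e_n), \]
where $v \in [a_1, x_n]$, we get:
\[ h'(e_n) = \frac{\partial^2 f}{\partial x \partial y}(s_n, e_n) \cdot (x_n - a_1), \]
where $s_n \in [a_1, x_n]$. Hence,
\[ R(x_n, y_n) = \frac{\partial^2 f}{\partial x \partial y}(s_n, e_n) \cdot (x_n - a_1)(y_n - a_2). \tag{1.5} \]
Comparing the formulas (1.3) and (1.5), we get:
\[ \frac{\partial^2 f}{\partial y \partial x}(c_n, d_n) = \frac{\partial^2 f}{\partial x \partial y}(s_n, e_n). \tag{1.6} \]
Since the functions $\frac{\partial^2 f}{\partial y \partial x}$ and $\frac{\partial^2 f}{\partial x \partial y}$ are continuous on $A$, since $\{c_n\}, \{s_n\} \to a_1$, and since $\{d_n\}, \{e_n\} \to a_2$ (why?), from formula (1.6) we get:
\[ \frac{\partial^2 f}{\partial y \partial x}(a_1, a_2) = \frac{\partial^2 f}{\partial x \partial y}(a_1, a_2). \]
Hence, the proof of the theorem is complete.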


In (1.1) 
Of Of 
——(0,0) = —1 4 ——~—(0,0) = 1 
Fan (n0) =-1# (0,0) =1, 
because 2-4 js not continuous at 0,0). Indeed, 

OyOxr 

oF 78 —yS—9n274—15art4y2 

Byoe y) = : Care ©, if (x,y) 4 (0,0) , 


1, ifz=0,y=0. 


and this last function has no limit at (0,0). This is because, if we take 
an arbitrary m and consider (x,y) with y = mz, we get that 

i x® — 6 — Ox7y* — 15arty? 1—25m® 

im = : 

20,y=mx (ae? + y?)3 (1 + eh a 

which is dependent on m. So, the limit at (0,0) is not a unique number. 
It depends on the direction on which we come to (0,0). All of these 
happen because the function 


7 = y° _ Qar744 =o 15axty? 
(a + y?)8 
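is homogeneous of degree 0 (make this clear for yourself!).

In engineering, the case of functions of class $C^2$ is the most frequent, thus we assume in the following that the order of derivation does not matter. For instance, $f(x, y) = 4x^3 y^2 + 2x^2 y$ is of class $C^\infty$ on $\mathbb{R}^2$ (why?). In particular, it is of class $C^2$ because $C^\infty$ means that $f$ has partial derivatives of any order (so these derivatives are continuous; why?). Schwarz' theorem says that
\[ \frac{\partial^2 f}{\partial x \partial y}(a, b) = \frac{\partial^2 f}{\partial y \partial x}(a, b) \]
for any point $(a, b)$ in $\mathbb{R}^2$. Indeed,
\[ \frac{\partial^2 f}{\partial x \partial y}(a, b) = \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial y} \right)(a, b) = \frac{\partial}{\partial x}(8x^3 y + 2x^2) \Big|_{(a, b)} = (24x^2 y + 4x) \Big|_{(a, b)} = 24a^2 b + 4a, \]
\[ \frac{\partial^2 f}{\partial y \partial x}(a, b) = \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial x} \right)(a, b) = \frac{\partial}{\partial y}(12x^2 y^2 + 4xy) \Big|_{(a, b)} = (24x^2 y + 4x) \Big|_{(a, b)} = 24a^2 b + 4a. \]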
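The equality of the two mixed derivatives of $f(x, y) = 4x^3 y^2 + 2x^2 y$ can also be checked numerically. In the sketch below the first-order partial derivatives are entered exactly, the second differentiation is approximated by a central difference, and the sample point `(2, -1)` is an arbitrary choice.

```python
def df_dx(x, y):
    # Exact partial derivative of 4x^3 y^2 + 2x^2 y with respect to x.
    return 12 * x * x * y * y + 4 * x * y

def df_dy(x, y):
    # Exact partial derivative of 4x^3 y^2 + 2x^2 y with respect to y.
    return 8 * x ** 3 * y + 2 * x * x

def mixed_partials(a, b, step=1e-5):
    # d/dy of df/dx and d/dx of df/dy, both by central differences.
    d_dy_of_fx = (df_dx(a, b + step) - df_dx(a, b - step)) / (2 * step)
    d_dx_of_fy = (df_dy(a + step, b) - df_dy(a - step, b)) / (2 * step)
    return d_dy_of_fx, d_dx_of_fy

a, b = 2.0, -1.0
yx, xy = mixed_partials(a, b)
closed_form = 24 * a * a * b + 4 * a   # 24 a^2 b + 4a = -88 at (2, -1)
```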

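Sometimes it is more convenient to change the order of derivation. For instance, $f(x, y) = y \ln(x^2 + y^2 + 1)$ is of class $C^\infty$ on $\mathbb{R}^2$ (why?). In order to compute $\frac{\partial^2 f}{\partial x \partial y}$ it is easier to compute $\frac{\partial^2 f}{\partial y \partial x}$, i.e. to compute firstly $\frac{\partial f}{\partial x} = \frac{2xy}{x^2 + y^2 + 1}$ and secondly
\[ \frac{\partial}{\partial y}\left( \frac{2xy}{x^2 + y^2 + 1} \right) = \frac{2x}{x^2 + y^2 + 1} - \frac{4xy^2}{(x^2 + y^2 + 1)^2}, \]
than to compute firstly
\[ \frac{\partial f}{\partial y} = \ln(x^2 + y^2 + 1) + \frac{2y^2}{x^2 + y^2 + 1} \]
and secondly
\[ \frac{\partial}{\partial x}\left[ \ln(x^2 + y^2 + 1) + \frac{2y^2}{x^2 + y^2 + 1} \right] \]
(why? Count the number of operations and their difficulties in each case!).
The following notion will be very helpful in the applications of the differential calculus.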
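DEFINITION 30. Let $A$ be an open subset in $\mathbb{R}^n$ and let $a = (a_1, a_2, \ldots, a_n)$ be a fixed point (vector) in $A$. Let $f$ be a function of class $C^2$ on $A$, $f : A \to \mathbb{R}$. The symmetric matrix
\[ H_{f,a} = (s_{ij}) = \left( \frac{\partial^2 f}{\partial x_i \partial x_j}(a) \right), \quad i = 1, 2, \ldots, n,\; j = 1, 2, \ldots, n, \]
is called the Hessian matrix of $f$ at $a$. The quadratic form $d^2 f(a)$ defined on $\mathbb{R}^n$, relative to its canonical basis
\[ \{e_1 = (1, 0, 0, \ldots, 0), e_2 = (0, 1, 0, \ldots, 0), \ldots, e_n = (0, 0, 0, \ldots, 0, 1)\} \]
(see a Linear Algebra course!), with values in $\mathbb{R}$,
\[ d^2 f(a)(h_1, h_2, \ldots, h_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2 f}{\partial x_i \partial x_j}(a) h_i h_j, \tag{1.7} \]
is called the second differential of $f$ at $a$. Its matrix is exactly the Hessian matrix of $f$ at $a$. For instance, if $f$ is a function of 2 variables, $x_1 = x$, $x_2 = y$ and $a = (a, b)$, then formula (1.7) becomes
\[ d^2 f(a, b)(h_1, h_2) = \frac{\partial^2 f}{\partial x^2}(a, b) h_1^2 + 2 \frac{\partial^2 f}{\partial x \partial y}(a, b) h_1 h_2 + \frac{\partial^2 f}{\partial y^2}(a, b) h_2^2. \tag{1.8} \]
If we introduce the projection functions $dx_i(h_1, h_2, \ldots, h_n) = h_i$ for $i = 1, 2, \ldots, n$, we get a more compact formula for (1.7):
\[ d^2 f(a) = \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2 f}{\partial x_i \partial x_j}(a)\, dx_i\, dx_j. \tag{1.9} \]
Here, $dx_i dx_j$ is the product of the two linear mappings $dx_i, dx_j : \mathbb{R}^n \to \mathbb{R}$, i.e.
\[ (dx_i dx_j)(h) = dx_i(h) \cdot dx_j(h) = h_i h_j, \]
where $h = (h_1, h_2, \ldots, h_n)$. For two variables we get
\[ d^2 f(a, b) = \frac{\partial^2 f}{\partial x^2}(a, b)\, dx^2 + 2 \frac{\partial^2 f}{\partial x \partial y}(a, b)\, dx\, dy + \frac{\partial^2 f}{\partial y^2}(a, b)\, dy^2, \tag{1.10} \]
where $dx^2$ is $dx \cdot dx$ and not $d(x^2)$, which is equal to $2x\, dx$ (why?). The same for $dy^2$. The analogous formula for a function of 3 variables $f(x, y, z)$ is
\[ \begin{aligned} d^2 f(a, b, c) = {} & \frac{\partial^2 f}{\partial x^2}(a, b, c)\, dx^2 + \frac{\partial^2 f}{\partial y^2}(a, b, c)\, dy^2 + \frac{\partial^2 f}{\partial z^2}(a, b, c)\, dz^2 \\ & + 2 \frac{\partial^2 f}{\partial x \partial y}(a, b, c)\, dx\, dy + 2 \frac{\partial^2 f}{\partial x \partial z}(a, b, c)\, dx\, dz + 2 \frac{\partial^2 f}{\partial y \partial z}(a, b, c)\, dy\, dz. \end{aligned} \tag{1.11} \]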
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For instance, let us compute the second differential for 
f(z, y, 2) = 227 + 82y?z 4+ 23 
at the point (—1, 2,3). First of all we compute 


OF O Of 0 
72 toh 2) = Da Dg btw Zz) = 55 (Or + 3y?z) = 122. 
So, Ff (-1, 2,3) = —12. It is easy to find 
emi Of 
Aye eee a =13; gp2 (712,38) 7 18, 
of (219: 3) 36 cane 23) = 12 OT 4 73) = 12 
OtOys "ez (77% Byes 7 
Now we use (1.11) and find 
(1.12) 


d* f(—1, 2,3) = —12dx* — 18dy? + 18dz? + 72drdy + 24drdz — 24dydz, 


i.e. we have a quadratic form in 3 variables dx, dy, dz. Clearer, this last 
quadratic form is 


g(X,Y, Z) = -12X? — 18Y2 4+ 18724 72XY + 24XZ — 24Y Z. 


Now, if we substitute X with dz, Y with dy and Z with dz, we get 
G19), 
Let us compute the value of this last function

$$d^2 f(-1,2,3) : \mathbb{R}^3 \to \mathbb{R}$$
at the point $(2,-3,-4)$. Since
$$dx^2(2,-3,-4) = 2^2 = 4, \quad dy^2(2,-3,-4) = (-3)^2 = 9, \quad dz^2(2,-3,-4) = (-4)^2 = 16,$$
$$dx\,dy(2,-3,-4) = 2\cdot(-3) = -6, \quad dx\,dz(2,-3,-4) = 2\cdot(-4) = -8, \quad dy\,dz(2,-3,-4) = (-3)(-4) = 12,$$
we finally obtain

$$d^2 f(-1,2,3)(2,-3,-4) = -12\cdot 4 - 18\cdot 9 + 18\cdot 16 + 72\cdot(-6) + 24\cdot(-8) - 24\cdot 12$$

$$= -12\cdot 4 + 7\cdot 18 + 24\cdot(-38) = -12(4+76) + 7\cdot 18 = 6\cdot(-139) = -834.$$
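The evaluation of a second differential at a vector $h$ is just the quadratic form $\sum_{i,j} H_{ij}h_ih_j$ built from the matrix of second-order partial derivatives. The following is a minimal Python sketch of that computation, assuming the function is $f(x,y,z) = 2x^3 + 3xy^2z + z^3$ (the function consistent with the derivatives computed above); the helper names `hessian_f` and `second_differential` are illustrative only.

```python
def second_differential(hessian, h):
    """Evaluate d^2 f(a)(h) = sum over i, j of H[i][j] * h[i] * h[j]."""
    n = len(h)
    return sum(hessian[i][j] * h[i] * h[j] for i in range(n) for j in range(n))


def hessian_f(x, y, z):
    """Second-order partials of f(x, y, z) = 2x^3 + 3xy^2 z + z^3 (computed by hand)."""
    return [
        [12 * x,    6 * y * z, 3 * y * y],
        [6 * y * z, 6 * x * z, 6 * x * y],
        [3 * y * y, 6 * x * y, 6 * z],
    ]


H = hessian_f(-1, 2, 3)
value = second_differential(H, (2, -3, -4))
print(H)      # matrix of second-order partial derivatives at (-1, 2, 3)
print(value)  # value of the quadratic form at h = (2, -3, -4)
```

The off-diagonal entries are counted twice in the double sum, which is where the factors 2 in formula (1.11) come from.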


Now, let us look carefully at the formulas (1.13), (1.7) and (1.9). 
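We introduce some symbolic operations in order to find a unitary and general formula. We called $\frac{\partial}{\partial x_j}$ a differential operator. By definition, we multiply two such operators $\frac{\partial}{\partial x_i}$ and $\frac{\partial}{\partial x_j}$ by simple composition:

$$\frac{\partial}{\partial x_i}\cdot\frac{\partial}{\partial x_j} = \frac{\partial}{\partial x_i}\circ\frac{\partial}{\partial x_j} = \frac{\partial^2}{\partial x_i\,\partial x_j}.$$

For instance,

$$\left(\frac{\partial}{\partial x}\cdot\frac{\partial}{\partial y}\right)(3x^2 + 5xy^3) = \frac{\partial}{\partial x}\left(\frac{\partial}{\partial y}(3x^2 + 5xy^3)\right) = \frac{\partial}{\partial x}(15xy^2) = 15y^2.$$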


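Moreover,

$$df(a,b) = \frac{\partial f}{\partial x}(a,b)\,dx + \frac{\partial f}{\partial y}(a,b)\,dy$$

can be written as an operator "on f" at an arbitrary point (which will not appear):

$$d = \frac{\partial}{\partial x}\,dx + \frac{\partial}{\partial y}\,dy.$$

This is also called a differential operator. How do we multiply two such operators?

$$\left(\frac{\partial}{\partial x}dx + \frac{\partial}{\partial y}dy\right)\left(\frac{\partial}{\partial z}dz + \frac{\partial}{\partial w}dw\right) \overset{\text{def}}{=} \frac{\partial^2}{\partial x\,\partial z}\,dx\,dz + \frac{\partial^2}{\partial y\,\partial z}\,dy\,dz + \frac{\partial^2}{\partial x\,\partial w}\,dx\,dw + \frac{\partial^2}{\partial y\,\partial w}\,dy\,dw.$$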


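This means that whenever we multiply operators we just compose them, and whenever we multiply linear mappings we just multiply them as functions. The latter always appear as coefficients of the differential operators. For instance,

$$\left(\frac{\partial}{\partial x}dx + \frac{\partial}{\partial y}dy\right)^2 = \frac{\partial^2}{\partial x^2}\,dx^2 + 2\frac{\partial^2}{\partial x\,\partial y}\,dx\,dy + \frac{\partial^2}{\partial y^2}\,dy^2. \tag{1.13}$$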


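Hence,

$$d^2 f(a,b) = \left(\frac{\partial}{\partial x}dx + \frac{\partial}{\partial y}dy\right)^2(f)(a,b)$$

with this last notation. We observe that (1.13) is a binomial formula of the type $(a+b)^2 = a^2 + 2ab + b^2$ (with the multiplication between differential operators indicated above). If we multiply both sides of (1.13) again by $\frac{\partial}{\partial x}dx + \frac{\partial}{\partial y}dy$, we easily get

$$\left(\frac{\partial}{\partial x}dx + \frac{\partial}{\partial y}dy\right)^3 = \frac{\partial^3}{\partial x^3}\,dx^3 + 3\frac{\partial^3}{\partial x^2\,\partial y}\,dx^2\,dy + 3\frac{\partial^3}{\partial x\,\partial y^2}\,dx\,dy^2 + \frac{\partial^3}{\partial y^3}\,dy^3,$$

i.e. the analogue of $(a+b)^3 = a^3 + 3a^2b + 3ab^2 + b^3$.

DEFINITION 31. (the differential of order k) In general, if a function $f$ of $n$ variables, $f : A \to \mathbb{R}$, is of class $C^k$ on $A$, i.e. it has all continuous partial derivatives of the type

$$\frac{\partial^k f}{\partial x_1^{k_1}\,\partial x_2^{k_2}\cdots\partial x_n^{k_n}}(a)$$
(where $k$ is a fixed natural number, $k > 0$, and $k_1, k_2, \ldots, k_n$ are natural numbers such that $k = k_1 + k_2 + \cdots + k_n$ and $0 \le k_1, k_2, \ldots, k_n \le k$) at any point $a$ of $A$, the $k$-th differential of $f$ at $a$ is by definition

$$d^k f(a) = \left(\frac{\partial}{\partial x_1}dx_1 + \frac{\partial}{\partial x_2}dx_2 + \cdots + \frac{\partial}{\partial x_n}dx_n\right)^k(f)(a). \tag{1.14}$$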

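For instance, if $n = 2$, $x_1 = x$, $x_2 = y$ and $\mathbf{a} = (a,b)$, then this last formula becomes

$$d^k f(a,b) = \left(\frac{\partial}{\partial x}dx + \frac{\partial}{\partial y}dy\right)^k(f)(a,b) = \sum_{i=0}^{k}\binom{k}{i}\frac{\partial^k f}{\partial x^{k-i}\,\partial y^i}(a,b)\,dx^{k-i}\,dy^i, \tag{1.15}$$
where $\binom{k}{i} = \frac{k!}{i!\,(k-i)!}$ is the number of combinations of $k$ objects taken $i$ at a time. The analogy with the binomial formula

$$(a+b)^k = \sum_{i=0}^{k}\binom{k}{i}a^{k-i}b^i$$

is now clear.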
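Let us compute

$$d^4 f(1,-1) = \left(\frac{\partial}{\partial x}dx + \frac{\partial}{\partial y}dy\right)^4(f)(1,-1)$$

for $f(x,y) = x^5 + xy^4$. For $k = 4$ formula (1.15) becomes

$$d^4 f(1,-1) = \binom{4}{0}\frac{\partial^4 f}{\partial x^4}(1,-1)\,dx^4 + \binom{4}{1}\frac{\partial^4 f}{\partial x^3\,\partial y}(1,-1)\,dx^3\,dy + \binom{4}{2}\frac{\partial^4 f}{\partial x^2\,\partial y^2}(1,-1)\,dx^2\,dy^2 + \binom{4}{3}\frac{\partial^4 f}{\partial x\,\partial y^3}(1,-1)\,dx\,dy^3 + \binom{4}{4}\frac{\partial^4 f}{\partial y^4}(1,-1)\,dy^4.$$

Now, everything reduces to the computation of the mixed partial derivatives:

$$\frac{\partial^4 f}{\partial x^4}(1,-1) = 120, \quad \frac{\partial^4 f}{\partial x^3\,\partial y}(1,-1) = 0, \quad \frac{\partial^4 f}{\partial x^2\,\partial y^2}(1,-1) = 0,$$
$$\frac{\partial^4 f}{\partial x\,\partial y^3}(1,-1) = -24, \quad \frac{\partial^4 f}{\partial y^4}(1,-1) = 24.$$
Hence,

$$\left(\frac{\partial}{\partial x}dx + \frac{\partial}{\partial y}dy\right)^4(f)(1,-1) = 120\,dx^4 - 96\,dx\,dy^3 + 24\,dy^4.$$

If we want to compute the value of this last differential at $(2,3)$, for instance, we obtain

$$120\cdot 2^4 - 96\cdot 2\cdot 3^3 + 24\cdot 3^4 = -1320.$$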
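The binomial formula (1.15) translates directly into a short loop. The sketch below assumes the function is $f(x,y) = x^5 + xy^4$ and that its fourth-order partial derivatives at $(1,-1)$ have been computed by hand; the function name `kth_differential_2d` and the list layout are illustrative choices, not notation from the text.

```python
from math import comb


def kth_differential_2d(k, partials, dx, dy):
    """Formula (1.15): sum over i of C(k, i) * D_i * dx^(k-i) * dy^i.

    partials[i] is the value, at the base point, of the k-th order partial
    derivative taken (k - i) times in x and i times in y.
    """
    return sum(comb(k, i) * partials[i] * dx ** (k - i) * dy ** i
               for i in range(k + 1))


# Fourth-order partials of f(x, y) = x^5 + x*y^4 at (1, -1), computed by hand:
# f_xxxx = 120x, f_xxxy = 0, f_xxyy = 0, f_xyyy = 24y, f_yyyy = 24x.
partials_at_point = [120, 0, 0, -24, 24]
value = kth_differential_2d(4, partials_at_point, 2, 3)
print(value)  # d^4 f(1, -1) evaluated at (dx, dy) = (2, 3)
```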
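Let us now compute

$$d^2 f(1,1,0) = \left(\frac{\partial}{\partial x}dx + \frac{\partial}{\partial y}dy + \frac{\partial}{\partial z}dz\right)^2(f)(1,1,0)$$
for $f(x,y,z) = x^2 + y^2 + xz + yz$. To make this easier, let us recall the elementary algebraic formula:

$$(a+b+c)^2 = a^2 + b^2 + c^2 + 2ab + 2ac + 2bc.$$

Using the above multiplication between operators, we get

$$d^2 f(1,1,0) = \frac{\partial^2 f}{\partial x^2}(1,1,0)\,dx^2 + \frac{\partial^2 f}{\partial y^2}(1,1,0)\,dy^2 + \frac{\partial^2 f}{\partial z^2}(1,1,0)\,dz^2 + 2\frac{\partial^2 f}{\partial x\,\partial y}(1,1,0)\,dx\,dy + 2\frac{\partial^2 f}{\partial x\,\partial z}(1,1,0)\,dx\,dz + 2\frac{\partial^2 f}{\partial y\,\partial z}(1,1,0)\,dy\,dz$$

$$= 2\,dx^2 + 2\,dy^2 + 2\,dx\,dz + 2\,dy\,dz.$$

If one wants to compute $d^2 f(1,1,0)(3,4,5)$, we get

$$d^2 f(1,1,0)(3,4,5) = 2\cdot 3^2 + 2\cdot 4^2 + 2\cdot 3\cdot 5 + 2\cdot 4\cdot 5 = 120.$$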


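Since

$$(a_1 + a_2 + \cdots + a_n)^m = \sum_{\substack{k_1+k_2+\cdots+k_n = m \\ k_i \in \mathbb{N}}} \frac{m!}{k_1!\,k_2!\cdots k_n!}\,a_1^{k_1}a_2^{k_2}\cdots a_n^{k_n},$$

one has the following formula for the $m$-th differential of $f$ at a point $a \in A$:

$$d^m f(a) = \sum_{k_1+k_2+\cdots+k_n = m} \frac{m!}{k_1!\,k_2!\cdots k_n!}\,\frac{\partial^m f}{\partial x_1^{k_1}\,\partial x_2^{k_2}\cdots\partial x_n^{k_n}}(a)\,dx_1^{k_1}\,dx_2^{k_2}\cdots dx_n^{k_n},$$

where in these last two sums $k_1, k_2, \ldots, k_n$ take all the natural values under the restriction $k_1 + k_2 + \cdots + k_n = m$.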
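For polynomial functions the multinomial formula for $d^m f(a)(h)$ can be evaluated exactly by a small program. The following is a minimal sketch under stated assumptions: a polynomial is stored as a dictionary mapping exponent tuples to coefficients (a representation chosen here for illustration), and the helper names `partial`, `evaluate` and `differential` are hypothetical.

```python
from itertools import product
from math import factorial


def partial(poly, var):
    """Differentiate a polynomial {exponent tuple: coefficient} once in variable var."""
    result = {}
    for exps, coeff in poly.items():
        if exps[var] > 0:
            new_exps = exps[:var] + (exps[var] - 1,) + exps[var + 1:]
            result[new_exps] = result.get(new_exps, 0) + coeff * exps[var]
    return result


def evaluate(poly, point):
    """Value of the polynomial at the given point."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for x, e in zip(point, exps):
            term *= x ** e
        total += term
    return total


def differential(poly, m, a, h):
    """d^m f(a)(h) by the multinomial formula, summing over k_1 + ... + k_n = m."""
    n = len(a)
    total = 0
    for ks in product(range(m + 1), repeat=n):
        if sum(ks) != m:
            continue
        deriv = poly
        for var, k in enumerate(ks):
            for _ in range(k):
                deriv = partial(deriv, var)
        coeff = factorial(m)
        for k in ks:
            coeff //= factorial(k)  # multinomial coefficient m! / (k_1! ... k_n!)
        h_power = 1
        for hi, k in zip(h, ks):
            h_power *= hi ** k
        total += coeff * evaluate(deriv, a) * h_power
    return total


# f(x, y, z) = x^2 + y^2 + xz + yz, second differential at (1, 1, 0), h = (3, 4, 5)
f3 = {(2, 0, 0): 1, (0, 2, 0): 1, (1, 0, 1): 1, (0, 1, 1): 1}
print(differential(f3, 2, (1, 1, 0), (3, 4, 5)))

# f(x, y) = x^5 + x*y^4, fourth differential at (1, -1), h = (2, 3)
f2 = {(5, 0): 1, (1, 4): 1}
print(differential(f2, 4, (1, -1), (2, 3)))
```

Enumerating all tuples with `product` is wasteful for large $n$ or $m$, but it keeps the correspondence with the sum over $k_1 + \cdots + k_n = m$ easy to see.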


2. Chain rules in two variables 


During the mathematical modeling of physical phenomena, one usually must find functions $z = z(x,y)$ which verify an equality of the following form (a partial differential equation of order 2, i.e. a PDE):

$$A(x,y)\frac{\partial^2 z}{\partial x^2}(x,y) + 2B(x,y)\frac{\partial^2 z}{\partial x\,\partial y}(x,y) + C(x,y)\frac{\partial^2 z}{\partial y^2}(x,y) + E\left(x, y, z(x,y), \frac{\partial z}{\partial x}(x,y), \frac{\partial z}{\partial y}(x,y)\right) = 0, \tag{2.1}$$

where $A$, $B$, $C$, $E$ are continuous functions of the indicated free variables. Regarding $E$, we must add that it is a continuous function $E(X,Y,Z,U,V)$ of 5 free variables, where instead of $X, Y, Z, U, V$ we put $x$, $y$, $z(x,y)$, $\frac{\partial z}{\partial x}(x,y)$ and $\frac{\partial z}{\partial y}(x,y)$ respectively. In order to find all the functions $z(x,y)$ of class $C^2$ on a fixed plane domain $D$ which verify (2.1), we change the "old" variables $x, y$ to new ones $u = u(x,y)$ and $v = v(x,y)$ (functions of the former) such that some of the new "coefficients" $A$, $B$ or $C$ become zero. How to find these new functions $u = u(x,y)$ and $v = v(x,y)$ is a problem which will be considered in another course. Our problem here is how to write the partial derivatives

$$\frac{\partial^2 z}{\partial x^2}(x,y), \quad \frac{\partial^2 z}{\partial x\,\partial y}(x,y), \quad \frac{\partial^2 z}{\partial y^2}(x,y), \quad \frac{\partial z}{\partial x}(x,y), \quad \frac{\partial z}{\partial y}(x,y)$$


as functions of $u$ and $v$. The transition from the "old" variables to the "new" ones $u$ and $v$ is realised by a "change of variables" function $F(x,y) = (u(x,y), v(x,y))$ such that $F$ is invertible and of class $C^1$ on its domain of definition. Moreover, its inverse $G = F^{-1}$ is also a function (in the variables $u$ and $v$) of class $C^1$ (see also the section "Change of variables"). Let $Z$ be the composed function $z \circ G$. Hence $z = Z \circ F$, or

$$Z(u(x,y), v(x,y)) = z(x,y).$$


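The chain rule formulas (2.9) and (2.10) supply us with formulas for $\frac{\partial z}{\partial x}(x,y)$ and $\frac{\partial z}{\partial y}(x,y)$:

$$\frac{\partial z}{\partial x}(x,y) = \frac{\partial Z}{\partial u}(u(x,y),v(x,y))\,\frac{\partial u}{\partial x}(x,y) + \frac{\partial Z}{\partial v}(u(x,y),v(x,y))\,\frac{\partial v}{\partial x}(x,y), \tag{2.2}$$

$$\frac{\partial z}{\partial y}(x,y) = \frac{\partial Z}{\partial u}(u(x,y),v(x,y))\,\frac{\partial u}{\partial y}(x,y) + \frac{\partial Z}{\partial v}(u(x,y),v(x,y))\,\frac{\partial v}{\partial y}(x,y). \tag{2.3}$$

Let us use these formulas to find a similar formula for $\frac{\partial^2 z}{\partial x\,\partial y}(x,y)$. For this, let us denote by $g(x,y)$ and by $h(x,y)$ the new functions of $x$ and $y$ which appear in (2.3):

$$g(x,y) = \frac{\partial Z}{\partial u}(u(x,y),v(x,y))$$
and

$$h(x,y) = \frac{\partial Z}{\partial v}(u(x,y),v(x,y)).$$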


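Let us compute $\frac{\partial g}{\partial x}(x,y)$ and $\frac{\partial h}{\partial x}(x,y)$ by using formula (2.2) with $g$ instead of $z$ and $\frac{\partial Z}{\partial u}$ instead of $Z$ (respectively with $h$ instead of $z$ and $\frac{\partial Z}{\partial v}$ instead of $Z$):

$$\frac{\partial g}{\partial x}(x,y) = \frac{\partial}{\partial u}\left(\frac{\partial Z}{\partial u}\right)(u(x,y),v(x,y))\,\frac{\partial u}{\partial x}(x,y) + \frac{\partial}{\partial v}\left(\frac{\partial Z}{\partial u}\right)(u(x,y),v(x,y))\,\frac{\partial v}{\partial x}(x,y)$$
$$= \frac{\partial^2 Z}{\partial u^2}(u(x,y),v(x,y))\,\frac{\partial u}{\partial x}(x,y) + \frac{\partial^2 Z}{\partial u\,\partial v}(u(x,y),v(x,y))\,\frac{\partial v}{\partial x}(x,y) \tag{2.4}$$
and
$$\frac{\partial h}{\partial x}(x,y) = \frac{\partial^2 Z}{\partial u\,\partial v}(u(x,y),v(x,y))\,\frac{\partial u}{\partial x}(x,y) + \frac{\partial^2 Z}{\partial v^2}(u(x,y),v(x,y))\,\frac{\partial v}{\partial x}(x,y). \tag{2.5}$$
Let us come back to formula (2.3) and let us differentiate both sides with respect to $x$. We get:
$$\frac{\partial^2 z}{\partial x\,\partial y}(x,y) = \frac{\partial g}{\partial x}(x,y)\,\frac{\partial u}{\partial y}(x,y) + g(x,y)\,\frac{\partial^2 u}{\partial x\,\partial y}(x,y) + \frac{\partial h}{\partial x}(x,y)\,\frac{\partial v}{\partial y}(x,y) + h(x,y)\,\frac{\partial^2 v}{\partial x\,\partial y}(x,y).$$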
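If we take into account formulas (2.4) and (2.5), we finally obtain:
$$\frac{\partial^2 z}{\partial x\,\partial y}(x,y) = \frac{\partial^2 Z}{\partial u^2}(u(x,y),v(x,y))\,\frac{\partial u}{\partial x}(x,y)\,\frac{\partial u}{\partial y}(x,y) + \frac{\partial^2 Z}{\partial u\,\partial v}(u(x,y),v(x,y))\left[\frac{\partial u}{\partial x}(x,y)\,\frac{\partial v}{\partial y}(x,y) + \frac{\partial u}{\partial y}(x,y)\,\frac{\partial v}{\partial x}(x,y)\right]$$
$$+ \frac{\partial^2 Z}{\partial v^2}(u(x,y),v(x,y))\,\frac{\partial v}{\partial x}(x,y)\,\frac{\partial v}{\partial y}(x,y) + \frac{\partial Z}{\partial u}(u(x,y),v(x,y))\,\frac{\partial^2 u}{\partial x\,\partial y}(x,y) + \frac{\partial Z}{\partial v}(u(x,y),v(x,y))\,\frac{\partial^2 v}{\partial x\,\partial y}(x,y).$$

We can simply rewrite this formula as:

$$\frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial^2 Z}{\partial u^2}\,\frac{\partial u}{\partial x}\,\frac{\partial u}{\partial y} + \frac{\partial^2 Z}{\partial u\,\partial v}\left[\frac{\partial u}{\partial x}\,\frac{\partial v}{\partial y} + \frac{\partial u}{\partial y}\,\frac{\partial v}{\partial x}\right] + \frac{\partial^2 Z}{\partial v^2}\,\frac{\partial v}{\partial x}\,\frac{\partial v}{\partial y} + \frac{\partial Z}{\partial u}\,\frac{\partial^2 u}{\partial x\,\partial y} + \frac{\partial Z}{\partial v}\,\frac{\partial^2 v}{\partial x\,\partial y}. \tag{2.6}$$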
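If in this formula we formally put $x$ instead of $y$, we get another useful formula:

$$\frac{\partial^2 z}{\partial x^2} = \frac{\partial^2 Z}{\partial u^2}\left(\frac{\partial u}{\partial x}\right)^2 + 2\frac{\partial^2 Z}{\partial u\,\partial v}\,\frac{\partial u}{\partial x}\,\frac{\partial v}{\partial x} + \frac{\partial^2 Z}{\partial v^2}\left(\frac{\partial v}{\partial x}\right)^2 + \frac{\partial Z}{\partial u}\,\frac{\partial^2 u}{\partial x^2} + \frac{\partial Z}{\partial v}\,\frac{\partial^2 v}{\partial x^2}. \tag{2.7}$$

If here, in this last formula, we put $y$ instead of $x$, we get the last useful chain rule formula:

$$\frac{\partial^2 z}{\partial y^2} = \frac{\partial^2 Z}{\partial u^2}\left(\frac{\partial u}{\partial y}\right)^2 + 2\frac{\partial^2 Z}{\partial u\,\partial v}\,\frac{\partial u}{\partial y}\,\frac{\partial v}{\partial y} + \frac{\partial^2 Z}{\partial v^2}\left(\frac{\partial v}{\partial y}\right)^2 + \frac{\partial Z}{\partial u}\,\frac{\partial^2 u}{\partial y^2} + \frac{\partial Z}{\partial v}\,\frac{\partial^2 v}{\partial y^2}. \tag{2.8}$$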
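A second-order chain rule of this kind is easy to get wrong by a term, so a numerical sanity check can be useful. The sketch below compares the chain rule expression for the mixed derivative $\frac{\partial^2 z}{\partial x\,\partial y}$ with a central finite-difference approximation. The functions $u = xy$, $v = x + y^2$ and $Z(u,v) = u^2v + \sin v$ are arbitrary test choices, not taken from the text, and all their derivatives are computed by hand.

```python
import math


def u(x, y):
    return x * y


def v(x, y):
    return x + y * y


def Z(s, t):
    return s * s * t + math.sin(t)


def z(x, y):
    """The composed function z = Z(u(x, y), v(x, y))."""
    return Z(u(x, y), v(x, y))


def mixed_by_chain_rule(x, y):
    """Chain rule expression for the mixed derivative of z, derivatives by hand."""
    s, t = u(x, y), v(x, y)
    Z_u, Z_v = 2 * s * t, s * s + math.cos(t)
    Z_uu, Z_uv, Z_vv = 2 * t, 2 * s, -math.sin(t)
    u_x, u_y, u_xy = y, x, 1.0
    v_x, v_y, v_xy = 1.0, 2 * y, 0.0
    return (Z_uu * u_x * u_y
            + Z_uv * (u_x * v_y + u_y * v_x)
            + Z_vv * v_x * v_y
            + Z_u * u_xy
            + Z_v * v_xy)


def mixed_by_differences(x, y, step=1e-3):
    """Central finite-difference approximation of the mixed derivative of z."""
    return (z(x + step, y + step) - z(x + step, y - step)
            - z(x - step, y + step) + z(x - step, y - step)) / (4 * step * step)


x0, y0 = 0.8, -0.5
print(mixed_by_chain_rule(x0, y0), mixed_by_differences(x0, y0))
```

The two printed numbers should agree up to the discretisation error of the finite-difference scheme.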


EXAMPLE 16. (vibrating string equation) Let $S$ be a one-dimensional elastic wire (infinite, homogeneous and perfectly elastic) which vibrates freely, without an exterior perturbing force. It is considered to lie on the real line $Ox$. Let $y \ge 0$ be the time and let $z(x,y)$ be the deflection of the string at the point $M$ of coordinate $x$ and at the moment $y$. If one writes the D'Alembert equality, which equates the dynamic Newtonian force and the Hooke elasticity force, one gets a PDE of order 2 (the vibrating string equation):

$$\frac{\partial^2 z}{\partial y^2} = a^2\,\frac{\partial^2 z}{\partial x^2}, \tag{2.9}$$
where $a > 0$ is a constant depending on the density and on the elasticity modulus. In order to find all the functions $z = z(x,y)$ which verify the equality (2.9), i.e. to solve that equation, we must change the variables $x$ and $y$ to new ones $u = x - ay$ and $v = x + ay$ (see the Differential Equations course). Let us use the chain formulas (2.7) and (2.8) in order to change the variables in the equation (2.9). Since $\frac{\partial u}{\partial x} = \frac{\partial v}{\partial x} = 1$, $\frac{\partial u}{\partial y} = -a$, $\frac{\partial v}{\partial y} = a$ and all the second order derivatives of $u$ and $v$ are zero, we get

$$\frac{\partial^2 z}{\partial x^2} = \frac{\partial^2 Z}{\partial u^2} + 2\frac{\partial^2 Z}{\partial u\,\partial v} + \frac{\partial^2 Z}{\partial v^2}$$

and
$$\frac{\partial^2 z}{\partial y^2} = a^2\frac{\partial^2 Z}{\partial u^2} - 2a^2\frac{\partial^2 Z}{\partial u\,\partial v} + a^2\frac{\partial^2 Z}{\partial v^2}.$$
If we substitute these expressions in (2.9) we finally get

$$\frac{\partial^2 Z}{\partial u\,\partial v} = 0. \tag{2.10}$$
But this last PDE of order 2 can easily be solved. From (2.10) we obtain:

$$\frac{\partial}{\partial u}\left(\frac{\partial Z}{\partial v}\right) = 0,$$
i.e. $\frac{\partial Z}{\partial v}$ is a function $h(v)$ of $v$ only. Hence,

$$Z(u,v) = \int h(v)\,dv + g(u) = f(v) + g(u)$$

(why?), where $f$ and $g$ are two arbitrary functions of class $C^2$ on some open real subsets. Coming back to $x$ and $y$, we finally get the "general solution" of the vibrating string equation:

$$z(x,y) = f(x + ay) + g(x - ay).$$
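One can check numerically that a function of this form satisfies the vibrating string equation $\frac{\partial^2 z}{\partial y^2} = a^2\frac{\partial^2 z}{\partial x^2}$. The sketch below uses central second differences; the particular choices $f = \sin$, $g(s) = e^{-s^2}$, $a = 2$ and the sample point are arbitrary assumptions made only for illustration.

```python
import math


def z(x, y, a=2.0):
    """A sample solution z = f(x + a*y) + g(x - a*y), with f = sin and g(s) = exp(-s^2)."""
    return math.sin(x + a * y) + math.exp(-(x - a * y) ** 2)


def second_difference(func, t, step=1e-3):
    """Central finite-difference approximation of the second derivative of func at t."""
    return (func(t + step) - 2.0 * func(t) + func(t - step)) / step ** 2


a = 2.0
x0, y0 = 0.3, 0.7
z_xx = second_difference(lambda x: z(x, y0, a), x0)
z_yy = second_difference(lambda y: z(x0, y, a), y0)
residual = z_yy - a ** 2 * z_xx
print(residual)  # should be close to zero, up to discretisation error
```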
Other examples in which we use higher chain rules (here "higher" 
means 2 > 1!) will appear in the section "Change of variables". 


3. Taylor’s formula for several variables 


In Theorem 44 we obtained an approximation of a function of one variable, of class $C^{m+1}$ on an $\varepsilon$-neighborhood $(a-\varepsilon, a+\varepsilon)$ of a fixed point $a$, by a polynomial (the Taylor polynomial) of degree $m$ ($m$ is a fixed natural number). We also estimated the error of this approximation. We write again this classical and fundamental formula and try to generalize it to the case of a function of $n$ variables:

$$f(x) = f(a) + \frac{f'(a)}{1!}(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots + \frac{f^{(m)}(a)}{m!}(x-a)^m + \frac{f^{(m+1)}(c)}{(m+1)!}(x-a)^{m+1}, \tag{3.1}$$

where $c$ is a number between $x$ and $a$. Let us write formula (3.1) again by putting $h = x - a$, or $x = a + h$, and $c = a + t_x h$, where $t_x \in (0,1)$ ($t_x = \frac{c-a}{x-a}$, why?):

$$f(a+h) = f(a) + \frac{f'(a)}{1!}h + \frac{f''(a)}{2!}h^2 + \cdots + \frac{f^{(m)}(a)}{m!}h^m + \frac{f^{(m+1)}(a + t_x h)}{(m+1)!}h^{m+1}. \tag{3.2}$$

It is enough to generalize this formula for a scalar function of $n$ variables because, if $f = (f_1, f_2, \ldots, f_k)$ is a vector function with $k$ components, we simply write the Taylor formula for each component separately, i.e. we approximate componentwise.

Let $A$ be an open subset of $\mathbb{R}^n$ and let $f : A \to \mathbb{R}$ be a function of class $C^{m+1}$ on $A$. Let $a = (a_1, a_2, \ldots, a_n)$ be a fixed point of $A$ and let $V = B(a,r)$ be an $n$-dimensional open ball (see its definition in Chapter 6, Section 1) with centre at $a$ and of radius $r > 0$ which is contained in $A$ (why is such a thing possible?). If a point $x = (x_1, x_2, \ldots, x_n)$ is in the ball $V$, the whole segment

$$[a,x] = \{z = a + t(x - a) : t \in [0,1]\}$$

is contained in $V$ (why? In general, a ball is a convex subset; prove it!). A subset $C$ of $\mathbb{R}^n$ is said to be convex if whenever $a$ and $b$ are in $C$, the whole segment $[a,b]$ is contained in $C$.


THEOREM 72. (Taylor's formula for n variables) With the above notation and hypotheses, for any $h = (h_1, h_2, \ldots, h_n)$ small enough, such that $x = a + h \in V$ ($\|h\| < r$), one has the following Taylor formula:

$$f(a+h) = f(a) + \frac{1}{1!}\,df(a)(h) + \frac{1}{2!}\,d^2f(a)(h) + \cdots + \frac{1}{m!}\,d^m f(a)(h) + \frac{1}{(m+1)!}\,d^{m+1}f(c)(h), \tag{3.3}$$

where $c \in (a, a+h)$, i.e. $c = a + t_x h$ for some $t_x \in (0,1)$.
PROOF. ($n = 2$) Let
$$a = (a_1, a_2), \quad x = (x_1, x_2), \quad h = (h_1, h_2), \quad h_1 = x_1 - a_1, \quad h_2 = x_2 - a_2.$$

The segment $[a,x]$ is the usual segment with ends $a$ and $x$ in the plane $xOy$ (see Fig. 8.1). Let us restrict $f$ to the segment $[a,x]$. This means that to any point $a + th$, $t \in [0,1]$, we assign the number $f(a+th)$. One obtains a mapping $t \mapsto f(a+th)$, denoted here by $g : [0,1] \to \mathbb{R}$,

$$g(t) = f(a + th) = f(a_1 + th_1, a_2 + th_2).$$

Let us denote by $u_1$ and $u_2$ the functions $u_1(t) = a_1 + th_1$ and $u_2(t) = a_2 + th_2$ respectively. So, if

$$u(t) = (a_1 + th_1, a_2 + th_2),$$

i.e. if $u = (u_1, u_2)$, one has that $g = f \circ u$. Here $u$ is a continuous one-to-one mapping from $[0,1]$ onto $[a,x]$. Since $u$ is of class $C^\infty$ on $[0,1]$ (why?), we see that $g$ is of class $C^{m+1}$ on $[0,1]$. Let us apply Mac Laurin's formula (1.16) (or the general Taylor formula (3.1) with $a = 0$ and $x = 1$) to the function $g$:

$$g(1) = g(0) + \frac{1}{1!}g'(0) + \frac{1}{2!}g''(0) + \cdots + \frac{1}{m!}g^{(m)}(0) + \frac{1}{(m+1)!}g^{(m+1)}(t_x), \tag{3.4}$$

where $t_x \in (0,1)$. Since $g(1) = f(a+h)$ and $g(0) = f(a)$, one has only to prove that $g^{(k)}(0) = d^k f(a)(h)$ for any $k = 1, 2, \ldots, m+1$. We can use mathematical induction to prove this. Here, we prove only that $g'(0) = df(a)(h)$ and that $g''(0) = d^2 f(a)(h)$. For this purpose we use the chain rule formulas and the definition of the differential of order $k$. Indeed,

$$g'(t) = \frac{\partial f}{\partial x_1}[u_1(t), u_2(t)] \cdot u_1'(t) + \frac{\partial f}{\partial x_2}[u_1(t), u_2(t)] \cdot u_2'(t). \tag{3.5}$$

Hence, since $u_1'(t) = h_1$ and $u_2'(t) = h_2$,

$$g'(0) = \frac{\partial f}{\partial x_1}(a)\,h_1 + \frac{\partial f}{\partial x_2}(a)\,h_2 = df(a)(h).$$

Since $u_1''(t) = 0$ and $u_2''(t) = 0$, differentiating (3.5) once more one has:

$$g''(0) = \frac{\partial^2 f}{\partial x_1^2}(a)\,h_1^2 + 2\frac{\partial^2 f}{\partial x_1\,\partial x_2}(a)\,h_1h_2 + \frac{\partial^2 f}{\partial x_2^2}(a)\,h_2^2 = d^2 f(a)(h).$$

The same computation at $t = t_x$ gives the remainder term. If we take $c = a + t_x h$, one gets the formula (3.3) for $n = 2$.




Fig 8.1 


Let
$$P(x,y) = 2x^2y + 3xy^2 + x + y$$
be a polynomial of two variables $x$ and $y$. Let us write $P(x,y)$ as a polynomial $Q(x-1, y+2)$, i.e.

$$P(x,y) = a_{00} + a_{10}(x-1) + a_{01}(y+2) + a_{20}(x-1)^2 + a_{11}(x-1)(y+2) + a_{02}(y+2)^2$$
$$+ a_{30}(x-1)^3 + a_{21}(x-1)^2(y+2) + a_{12}(x-1)(y+2)^2 + a_{03}(y+2)^3.$$

We stop here because the "total" degree of $P(x,y)$ is $3 = 2+1$. We could find the coefficients $a_{ij}$ by elementary tricks (do it!). However, let us use Taylor's formula (3.3) with

$$a = (1,-2), \quad x = (x,y), \quad h_1 = x - 1, \quad h_2 = y + 2,$$
etc. We have only to compute $dP(a)$, $d^2P(a)$ and $d^3P(a)$ (why not $d^4P(a)$?). So,

$$dP(a) = \frac{\partial P}{\partial x}(a)\,dx + \frac{\partial P}{\partial y}(a)\,dy = (4xy + 3y^2 + 1)\big|_{(1,-2)}\,dx + (2x^2 + 6xy + 1)\big|_{(1,-2)}\,dy = 5\,dx - 9\,dy$$

and

$$dP(a)(h) = 5(x-1) - 9(y+2).$$


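Hence,
$$a_{00} = P(1,-2) = 7, \quad a_{10} = 5, \quad a_{01} = -9.$$

The coefficients $a_{20}$, $a_{11}$ and $a_{02}$ can be computed from the expression of $\frac{1}{2!}d^2P(a)(h)$. Namely,

$$\frac{\partial^2 P}{\partial x^2}(a) = (4y)\big|_{(1,-2)} = -8, \quad \frac{\partial^2 P}{\partial x\,\partial y}(a) = (4x + 6y)\big|_{(1,-2)} = -8$$

and $\frac{\partial^2 P}{\partial y^2}(a) = (6x)\big|_{(1,-2)} = 6$, i.e.

$$\frac{1}{2!}d^2P(a)(h) = -4(x-1)^2 - 8(x-1)(y+2) + 3(y+2)^2,$$

and so $a_{20} = -4$, $a_{11} = -8$ and $a_{02} = 3$. In order to find $a_{30}$, $a_{21}$, $a_{12}$ and $a_{03}$ one must compute

$$\frac{1}{3!}d^3P(a)(h) = \frac{1}{6}\left[\frac{\partial^3 P}{\partial x^3}(a)(x-1)^3 + 3\frac{\partial^3 P}{\partial x^2\,\partial y}(a)(x-1)^2(y+2) + 3\frac{\partial^3 P}{\partial x\,\partial y^2}(a)(x-1)(y+2)^2 + \frac{\partial^3 P}{\partial y^3}(a)(y+2)^3\right]$$
$$= 2(x-1)^2(y+2) + 3(x-1)(y+2)^2.$$
Thus $a_{30} = 0$, $a_{21} = 2$, $a_{12} = 3$ and $a_{03} = 0$. Finally one has:
$$P(x,y) = 7 + 5(x-1) - 9(y+2) - 4(x-1)^2 - 8(x-1)(y+2) + 3(y+2)^2 + 2(x-1)^2(y+2) + 3(x-1)(y+2)^2.$$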
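Since a Taylor expansion of a polynomial of total degree 3 up to order 3 is exact, the re-expanded form must agree with the original polynomial at every point. A minimal Python check, assuming $P(x,y) = 2x^2y + 3xy^2 + x + y$ and the coefficients found above (the sample points are arbitrary):

```python
def P(x, y):
    return 2 * x**2 * y + 3 * x * y**2 + x + y


def Q(s, t):
    """Expansion of P around (1, -2), written in s = x - 1 and t = y + 2."""
    return (7 + 5 * s - 9 * t
            - 4 * s**2 - 8 * s * t + 3 * t**2
            + 2 * s**2 * t + 3 * s * t**2)


points = [(0, 0), (1, -2), (2, 5), (-3, 4), (7, -1)]
differences = [P(x, y) - Q(x - 1, y + 2) for x, y in points]
print(differences)  # integer arithmetic, so an exact comparison is possible
```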
THEOREM 73. (Lagrange's Theorem for many variables, or the Mean Value Theorem) Let $A \subset \mathbb{R}^n$ be an open subset of $\mathbb{R}^n$, let $a$ be a point in $A$ and let $V = B(a,r) \subset A$, $r > 0$, be a ball with centre at $a$ and of radius $r$. Let $f : A \to \mathbb{R}$ be a function of class $C^1$ defined on $A$. Then, for any $x$ in $V$, there is a point $c$ in $[a,x]$ such that:

$$f(x) - f(a) = \frac{\partial f}{\partial x_1}(c)(x_1 - a_1) + \cdots + \frac{\partial f}{\partial x_n}(c)(x_n - a_n) = \langle \operatorname{grad} f(c), x - a \rangle, \tag{3.6}$$

i.e. the "increment" $f(x) - f(a)$ of $f$ on the segment $[a,x]$ is equal to the scalar product between the gradient vector $\operatorname{grad} f(c)$ of $f$ at a point $c$ of the segment $[a,x]$ and the vector $x - a$. If $x$ is very close to $a$, then we have an "affine" approximation of $f(x)$:

$$f(x) \approx f(a) + \frac{\partial f}{\partial x_1}(a)(x_1 - a_1) + \cdots + \frac{\partial f}{\partial x_n}(a)(x_n - a_n), \tag{3.7}$$

or a linear approximation of $f(x) - f(a)$:

$$f(x) - f(a) \approx \frac{\partial f}{\partial x_1}(a)(x_1 - a_1) + \cdots + \frac{\partial f}{\partial x_n}(a)(x_n - a_n) = \langle \operatorname{grad} f(a), h \rangle. \tag{3.8}$$

PROOF. It is sufficient to take $m = 0$ in the formula (3.3).


From formula (3.7) we see that it is sufficient to know the gradient vector $\operatorname{grad} f(a)$ of a function $f$ at a point $a$ and the value $f(a)$ of the same function at $a$ in order to approximate the values of this function in a neighborhood of $a$. For instance, let us compute approximately $\sin 46^\circ \cos 1^\circ$. For this, let us consider the function of two variables $f(x,y) = \sin x \cos y$, the point $a = \left(\frac{\pi}{4}, 0\right)$ and the point $x = \left(\frac{\pi}{4} + \frac{\pi}{180}, \frac{\pi}{180}\right)$. Then formula (3.7) says that:

$$\sin 46^\circ \cos 1^\circ \approx \frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2}\cdot\frac{\pi}{180}.$$
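The quality of this affine approximation can be checked with a few lines of Python. The sketch below evaluates the right-hand side of (3.7) for $f(x,y) = \sin x\cos y$ at $a = (\pi/4, 0)$ with increments of one degree in each variable, and compares it with the value given by the library functions; the variable names are illustrative.

```python
import math

a = (math.pi / 4, 0.0)
h = (math.pi / 180, math.pi / 180)  # one degree in each variable, in radians

f_a = math.sin(a[0]) * math.cos(a[1])
# Gradient of f(x, y) = sin(x) * cos(y): (cos x cos y, -sin x sin y)
grad = (math.cos(a[0]) * math.cos(a[1]), -math.sin(a[0]) * math.sin(a[1]))

approx = f_a + grad[0] * h[0] + grad[1] * h[1]
exact = math.sin(math.radians(46)) * math.cos(math.radians(1))
print(approx, exact, abs(approx - exact))
```

The second partial derivative vanishes at $a$ because $\sin 0 = 0$, so only the increment in $x$ contributes to the first-order correction.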


4. Problems 
1. Compute df and d?f for: 


f(x,y) = sin(a? + y’); 
flay, z)=Verty + 2; 


f(x,y) = exp(ry) 

at (1,1); find also df(1,1)(0,1) and d?f (1, 1)(0, 1). 

2. Approximate $\Delta f = f(x,y) - f(x_0,y_0)$ by $df(x_0,y_0)(\Delta x, \Delta y)$, where $\Delta x = x - x_0$, $\Delta y = y - y_0$, and then compute:

a 

) 
f(a,y) =a 

at the point A(e + 0.1,1+ 0.2); 


b) 
f(t,y) = Va? t+y? 
at A(4.001, 3.002); 
c) 
fla,y) =a" 
at A(1.02; 3.01). 
3. Use Taylor’s formula to approximate f by the Taylor polynomial 
T,, with Lagrange’s remainder: 
a) 
f(z,y) =m(14+2)4+In(1+y) 
at (0,0), with 74; 
b) 
fla,y) =a 
at (1,1), with T; and compute approximately (1.1)1?; 
Cc) 


f(@,y) = (exp) siny 
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at (0,0) with Tp; 
d) 
f(z,y,z) =e? +y' + 2 — 32yz 
at (1, 1,1), with Tp. 
4. Write 
P(x, y) = 2a? — 3x?y + 2y* + 92? — 3y + 62 +3 
as Q(x + 1,y—1). 
5. Compute approximately (0.95)?°; Hint: take 
g(x,y) =y" 
around A(2,1) and use 7). 
6. Compute d?f(0,0,0) for 
flayy,2) = 2? +y? + 2* = Qay* + Byz = 52727. 
7. Compute d?f(0,0)(0,0) for 
f(x,y) = cos(3x + 2y). 
8. Prove that 


a) 


verify the "heat equation": % (x,t) = a FU (x, t). 


9. Use Taylor’s formula to justify the following approximations: 
a) 


cost x? — y* 


cos y 7 2 
around (0,0); 
b) 


arctan Pied Sxux+y, 
+ ry 
around (0,0); 
c) 
around (0,0). 
10. Find df(1, —2)(2, 3); d2f(1, —2)(2,3) and d?f (1, —2)(2,3) for 
f(a,y) =a? + 2a7y. 


In(l+<2)-M(1+y)® zy, 


CHAPTER 9 


Contractions and fixed points 


1. Banach’s fixed point theorem 


Let $(X,d)$ be a metric space, i.e. a set $X$ with a distance function $d$ on it. This function $d$ associates to any pair $(x,y)$ of elements of $X$ a nonnegative real number $d(x,y)$ with the following properties:

i) $d(x,y) = 0$ if and only if $x = y$;

ii) $d(x,y) = d(y,x)$ for any $x, y$ in $X$ and

iii) $d(x,z) \le d(x,y) + d(y,z)$ for any $x, y, z$ in $X$ (the triangle inequality).

This triangle inequality can be generalized and one obtains the polygon inequality:

$$d(x_0, x_n) \le d(x_0, x_1) + d(x_1, x_2) + d(x_2, x_3) + \cdots + d(x_{n-1}, x_n) \tag{1.1}$$

for any finite sequence $\{x_0, x_1, x_2, \ldots, x_n\}$ of elements of $X$. It can easily be proved by mathematical induction on $n$. For $n = 1$ or $2$ it is clear. Suppose $n > 2$ and assume that the polygon inequality is true for any sequence of $k \le n$ elements of $X$. Let us prove it for a sequence of $n+1$ elements $\{x_0, x_1, x_2, \ldots, x_n\}$. By the induction hypothesis,

$$d(x_0, x_{n-1}) \le d(x_0, x_1) + d(x_1, x_2) + d(x_2, x_3) + \cdots + d(x_{n-2}, x_{n-1}). \tag{1.2}$$

Now, by the triangle inequality,

$$d(x_0, x_n) \le d(x_0, x_{n-1}) + d(x_{n-1}, x_n) \le \left[d(x_0, x_1) + d(x_1, x_2) + d(x_2, x_3) + \cdots + d(x_{n-2}, x_{n-1})\right] + d(x_{n-1}, x_n),$$
and the proof of (1.1) is done.
We have already met many examples of metric spaces: $(\mathbb{R}, d(x,y) = |x - y|)$, $(\mathbb{C}, d(z,w) = |z - w|)$, $(\mathbb{R}^n, d(x,y) = \|x - y\|)$, $C[a,b] = \{f : [a,b] \to \mathbb{R},\ f \text{ continuous}\}$ with

$$d(f,g) = \|f - g\| = \sup\{|f(x) - g(x)| : x \in [a,b]\},$$
etc. All of these metric spaces are complete metric spaces, i.e. metric spaces $(X,d)$ with the property that any Cauchy sequence has a limit in $X$. Not all metric spaces are complete. For instance, $X = (0,1]$ with the same distance as that of $\mathbb{R}$ is not complete, because the sequence $\left\{\frac{1}{n}\right\}$ is a Cauchy sequence in $X$ but it has no limit in $X$ (why?). It is easy to see that a subset $Y$ of a complete metric space $(X,d)$ is complete relative to the same distance as that of $X$ if and only if it is closed in $X$ (prove it!).

DEFINITION 32. (contraction) Let $(X,d)$ be a metric space. A function $f : X \to X$ is said to be a contraction on $X$ if there is a number $\lambda \in (0,1)$ such that

$$d(f(x), f(y)) \le \lambda\, d(x,y) \tag{1.3}$$
for any $x, y$ in $X$. This number $\lambda$ is called the (contraction) coefficient of $f$.

For instance, $f : [0,1] \to [0,1]$, $f(x) = 0.5x$, is a contraction of coefficient 0.5 (prove it!). But $g : \mathbb{R} \to \mathbb{R}$, $g(x) = x^2$, is not a contraction on $\mathbb{R}$ but... it is a contraction on $[0, 0.44]$ (prove it!).

Any contraction on $X$ is a uniformly continuous function on $X$ (why?). The same result is true even if $\lambda$ is an arbitrary positive real number. In this more general case we say that $f$ is a Lipschitzian function on $X$.


THEOREM 74. Let $A$ be a convex subset of $\mathbb{R}^n$ (if $a$ and $b$ are in $A$, then the whole segment $[a,b]$ is in $A$). Let $f : A \to A$ be a function of class $C^1$ on $A$ such that all the partial derivatives of $f$ are bounded by a number of the form $\lambda/n$, where $\lambda \in (0,1)$. Then $f$ is a contraction of coefficient $\lambda$ on $A$.

PROOF. Let us take $a, b$ in $A$ and let us write Taylor's formula for $m = 0$ ($b = a + h$):

$$f(b) - f(a) = \frac{\partial f}{\partial x_1}(c)\cdot(b_1 - a_1) + \frac{\partial f}{\partial x_2}(c)\cdot(b_2 - a_2) + \cdots + \frac{\partial f}{\partial x_n}(c)\cdot(b_n - a_n), \tag{1.4}$$
where $c$ is a point on the segment $[a,b]$ and $a = (a_1, a_2, \ldots, a_n)$, $b = (b_1, b_2, \ldots, b_n)$.

So,

$$d(f(a), f(b)) = \|f(b) - f(a)\| \le \left[\sum_{i=1}^{n}\left|\frac{\partial f}{\partial x_i}(c)\right|\right]\|a - b\| \le \lambda\, d(a,b).$$

Thus, our function is a contraction.


For instance, f(x) = 2° is a contraction on (0, 1], because | f’(x)| = 


2 |x?| < 2 on (0, 1]. 
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THEOREM 75. (Banach’s fixed point theorem) Let (X,d) be a com- 
plete metric space and let f : X — X be a contraction of coefficient 
$\lambda \in (0,1)$. Then there is a unique element $x$ in $X$ such that $f(x) = x$ (a fixed point for $f$). This unique fixed point $x$ of $f$ on $X$ can be obtained by the following method (the method of successive approximations). Start with an arbitrary element $x_0$ of $X$ and recurrently construct: $x_1 = f(x_0)$, $x_2 = f(x_1)$, ..., $x_n = f(x_{n-1})$, .... Then the sequence $\{x_n\}$ is convergent to this fixed point $x$. Moreover, if we approximate $x$ by $x_n$, the error $d(x, x_n)$ can be evaluated by the following formula:

$$d(x, x_n) \le d(x_1, x_0)\cdot\frac{\lambda^n}{1-\lambda}. \tag{1.5}$$

PROOF. It is sufficient to prove that $\{x_n\}$ is a Cauchy sequence (why? Remember that $X$ is complete, so $x_n \to x$; then use the continuity of $f$ in the recurrence relation: take limits and find $x = f(x)$). Let us evaluate the distance between the terms of the sequence $\{x_n\}$ by using the contraction formula (1.3):

$$d(x_2, x_1) = d(f(x_1), f(x_0)) \le \lambda\, d(x_1, x_0),$$

$$d(x_3, x_2) = d(f(x_2), f(x_1)) \le \lambda\, d(x_2, x_1) \le \lambda^2 d(x_1, x_0),$$

and so on, up to a general relation (use mathematical induction if you want!):

$$d(x_{n+1}, x_n) \le \lambda^n d(x_1, x_0). \tag{1.6}$$

Now,

$$d(x_{n+p}, x_n) \le d(x_{n+p}, x_{n+p-1}) + d(x_{n+p-1}, x_{n+p-2}) + \cdots + d(x_{n+1}, x_n) \tag{1.7}$$

comes from applying the polygon inequality (1.1). If in (1.7) we use formula (1.6), we get:

$$d(x_{n+p}, x_n) \le (\lambda^{n+p-1} + \lambda^{n+p-2} + \cdots + \lambda^n)\,d(x_1, x_0) \le \lambda^n(1 + \lambda + \lambda^2 + \cdots)\,d(x_1, x_0) = \frac{\lambda^n}{1-\lambda}\,d(x_1, x_0). \tag{1.8}$$

Since $\frac{\lambda^n}{1-\lambda} \to 0$ as $n \to \infty$, independently of $p$, the sequence $\{x_n\}$ is a Cauchy sequence. Since $(X,d)$ is complete, this sequence has a limit $x = \lim x_n$. Making $p \to \infty$ in (1.8) we get the desired estimation of the error:

$$d(x, x_n) \le \frac{\lambda^n}{1-\lambda}\,d(x_1, x_0)$$

(why does $d(x_{n+p}, x_n) \to d(x, x_n)$ as $p \to \infty$? Prove it!). Since $x_n = f(x_{n-1})$ and since $f$ is continuous, one has that $x = f(x)$. This fixed point $x$ is unique. Indeed, if $x = f(x)$ and $y = f(y)$, then

$$d(x,y) = d(f(x), f(y)) \le \lambda\, d(x,y),$$
or
$$d(x,y)\cdot[\lambda - 1] \ge 0.$$
Since $\lambda \in (0,1)$ and since $d(x,y) \ge 0$, the only possibility is that $d(x,y) = 0$, i.e. $x = y$.


Banach's fixed point theorem has many applications. For instance, it can be used to find approximate solutions of equations and systems of equations (linear or not!).

Take for example the polynomial

$$P(x) = x^3 - x^2 + 2x - 1$$

and let us search for a solution of the equation $P(x) = 0$ in the interval $X = [0,1]$. The equation $x^3 - x^2 + 2x - 1 = 0$ can also be written as:

$$\frac{x^2 + 1}{x^2 + 2} = x. \tag{1.9}$$
Let us prove that $f(x) = \frac{x^2+1}{x^2+2}$ is a contraction on $[0,1]$ (note that $0 < f(x) < 1$, so $f$ maps $[0,1]$ into itself). Indeed, $f'(x) = \frac{2x}{(x^2+2)^2}$ and

$$\left|\frac{2x}{(x^2+2)^2}\right| \le \frac{2}{4} = \frac{1}{2}$$

(why?) on $[0,1]$. Applying Theorem 74 we get that $f$ is a contraction of coefficient $\lambda = \frac{1}{2}$. So the equation (1.9) has a unique solution $a$ in $[0,1]$. Let us find it approximately with "two exact decimals". Formula (1.5) says that:

$$|a - x_n| \le \left(\frac{1}{2}\right)^n\cdot\frac{1}{1 - \frac{1}{2}}\,|x_1 - x_0| = \left(\frac{1}{2}\right)^{n-1}|x_1 - x_0|.$$

Let us take $x_0 = 0$. Then $x_1 = f(x_0) = \frac{1}{2}$. Thus,
$$|a - x_n| \le \frac{1}{2^n}.$$
If we require $\frac{1}{2^n} \le \frac{1}{100}$ we get $n = 7$. Hence, the true solution $a$ is approximately equal to

$$x_7 = (f \circ f \circ f \circ f \circ f \circ f \circ f)(0) = f(f(f(f(f(f(f(0))))))).$$
This last number can easily be found by using a loop in a computer language like Pascal or C++. The committed error is less than 0.01.
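The loop mentioned in the text can be sketched as follows, here in Python rather than Pascal or C++. The stopping rule uses the a priori error bound $\frac{\lambda^n}{1-\lambda}|x_1 - x_0|$ with $\lambda = \frac12$; the map $f(x) = \frac{x^2+1}{x^2+2}$ is the rewriting of $x^3 - x^2 + 2x - 1 = 0$ assumed above, and the variable names are illustrative.

```python
def f(x):
    """Fixed-point form of x^3 - x^2 + 2x - 1 = 0 on [0, 1]."""
    return (x * x + 1.0) / (x * x + 2.0)


def P(x):
    return x**3 - x**2 + 2 * x - 1


contraction = 0.5           # contraction coefficient lambda
tolerance = 0.01            # "two exact decimals"
x = 0.0                     # starting point x_0
first_step = abs(f(x) - x)  # |x_1 - x_0|
n = 0
bound = first_step / (1 - contraction)
while bound > tolerance:
    x = f(x)                # x_n = f(x_{n-1})
    n += 1
    bound = contraction**n / (1 - contraction) * first_step
print(n, x, bound, P(x))
```

The a priori bound is pessimistic here: near the fixed point the derivative of $f$ is well below $\frac12$, so the actual error after the loop is much smaller than the guaranteed 0.01.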
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2. Problems 


1. Using the Banach’s Fixed Point Theorem, find approximate 
solutions with the error ¢ = 10~? for the following equations: 

a) x? +2—5=0; b) 2? —sinz = 3; c) t= Z% cos. 

2. Which of the following mappings are contractions? Study the 
fixed points of them. 

a) f: ROR, f(e) = 2;b) f: ROR, f(x) = 27; 0) f: CC, 
f(z) = 24; 
d) f:C+C, fz)=2+z2+1e) f:R-R, f(x) = 5243; 

f) f: ROR, f(x) = garctanz; g) f: R>R, f(z,y) = (78, By). 
3. Try to find approximate solutions with 2 exact decimals for the following linear system of algebraic equations:

$$\begin{cases} 100x + 2y = 1 \\ 4x + 200y = 5 \end{cases}.$$

Hint: Write this system as:

$$\begin{cases} 0.01 - 0.02y = x \\ 0.025 - 0.02x = y \end{cases}.$$
Prove that the vector function $f : \mathbb{R}^2 \to \mathbb{R}^2$, defined by the formula $f(x,y) = (0.01 - 0.02y, 0.025 - 0.02x)$, is a contraction of coefficient $0.02 \times 2 < 1$. Then apply Banach's Fixed Point Theorem. At the end, compare the approximate result with the exact one!
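The comparison suggested in the hint can be carried out with a short script. This is only a sketch: it assumes the system $100x + 2y = 1$, $4x + 200y = 5$ and its fixed-point form $f(x,y) = (0.01 - 0.02y,\ 0.025 - 0.02x)$, starts from $(0,0)$, and uses five iterations, a number picked arbitrarily for illustration. The exact solution is obtained by Cramer's rule.

```python
def f(x, y):
    """Fixed-point form of the system 100x + 2y = 1, 4x + 200y = 5."""
    return 0.01 - 0.02 * y, 0.025 - 0.02 * x


x, y = 0.0, 0.0
for _ in range(5):
    x, y = f(x, y)  # successive approximations

# Exact solution by Cramer's rule
det = 100 * 200 - 2 * 4
exact = ((1 * 200 - 2 * 5) / det, (100 * 5 - 4 * 1) / det)
print((x, y), exact)
```

The iteration converges quickly because the off-diagonal coefficients of the system are small compared with the diagonal ones.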

4. What is the particularity of the system from Problem 3? Can 

we apply the Banach’s Fixed Point Theorem to all the linear systems? 


CHAPTER 10 


Local extremum points 


1. Local extremum points for many variables 


Let $A$ be an open subset of $\mathbb{R}^n$ and let $f : A \to \mathbb{R}$ be a scalar function defined on $A$. We say that $a = (a_1, a_2, \ldots, a_n)$ is a local maximum (minimum) point of $f$ if there is a small open ball $B(a,r) \subset A$, $r > 0$, such that $f(x) \le f(a)$ ($f(x) \ge f(a)$) for any $x$ in $B(a,r)$. Local maxima and local minima are referred to as local extrema. A local maximum point or a local minimum point is called an extremum point.


REMARK 30. Let $A$ be an open subset of $\mathbb{R}^n$ and let $i$ be a fixed natural number in the set $\{1, 2, \ldots, n\}$. Then the $i$-th projection $pr_i(A)$ of $A$ is the set of all $t \in \mathbb{R}$ such that there is an

$$x = (x_1, x_2, \ldots, x_{i-1}, t, x_{i+1}, \ldots, x_n)$$

in $A$ with $t$ at the $i$-th position. It is also an open subset of $\mathbb{R}$. Indeed, take $t_0 \in pr_i(A)$ and take $a$ in $A$ such that $a = (a_1, \ldots, a_{i-1}, t_0, a_{i+1}, \ldots, a_n)$. Since $A$ is open, there is a ball $B(a,r) \subset A$ with $r > 0$. We prove that the 1-D ball $(t_0 - r, t_0 + r)$ is contained in $pr_i(A)$. It is in fact the $i$-th projection of $B(a,r)$. For this, let $u \in (t_0 - r, t_0 + r)$, i.e. $|u - t_0| < r$. It is easy to see that

$$v = (a_1, a_2, \ldots, a_{i-1}, u, a_{i+1}, \ldots, a_n) \in B(a,r) \subset A.$$

Thus
$$u = pr_i(v) \in pr_i(A).$$

So $pr_i(A)$ is also open in $\mathbb{R}$.


THEOREM 76. (Fermat's theorem for many variables) Let $A$ be an open subset of $\mathbb{R}^n$ and let $a \in A$ be an extremum point of a function $f : A \to \mathbb{R}$, defined on $A$ with values in $\mathbb{R}$. If $f$ has partial derivatives $\frac{\partial f}{\partial x_j}(a)$, $j = 1, 2, \ldots, n$, at $a$, then all of these are zero, i.e. any extremum point $a$ of $f$ is a stationary (critical) point for $f$. This means that $a$ is a root of the vector equation $\operatorname{grad} f(x) = 0$, i.e. $\operatorname{grad} f(a) = 0$, or $df(a) = 0$, if this last one exists.

PROOF. Let us fix an $i$ in $\{1, 2, \ldots, n\}$ and let us define a function of one variable $g_i : (a_i - r, a_i + r) \to \mathbb{R}$ by the formula:

$$g_i(t) = f(a_1, \ldots, a_{i-1}, t, a_{i+1}, \ldots, a_n).$$
Here $r > 0$ is the radius of a small ball $B(a,r)$ which is contained in $A$ (see the above discussion). Assume that $a$ is a local maximum point for $f$. We can take $r$ to be small enough such that $f(x) \le f(a)$ for any $x$ in the ball $B(a,r)$ (why?). If $u \in (a_i - r, a_i + r)$, then
$$v = (a_1, a_2, \ldots, a_{i-1}, u, a_{i+1}, \ldots, a_n) \in B(a,r),$$
so
$$g_i(u) = f(a_1, \ldots, a_{i-1}, u, a_{i+1}, \ldots, a_n) \le f(a_1, \ldots, a_{i-1}, a_i, a_{i+1}, \ldots, a_n) = g_i(a_i).$$

This means that $a_i$ is a local maximum for the function $g_i$. We now use Fermat's theorem (Theorem 35) for the one variable function $g_i$ at the point $a_i$. Thus, $g_i'(a_i) = 0$. But

$$g_i'(t) = \frac{\partial f}{\partial x_i}(a_1, \ldots, a_{i-1}, t, a_{i+1}, \ldots, a_n).$$
Hence, $g_i'(a_i) = \frac{\partial f}{\partial x_i}(a) = 0$ for any $i = 1, 2, \ldots, n$, and the proof of the theorem is complete.


Fermat's theorem says that, for the class of differentiable functions $f$ defined on an open subset $A$ of $\mathbb{R}^n$, the local extremum points must be searched for among the critical points, i.e. among the points $a$ which are zeros of the gradient of $f$. For instance, for $f(x,y) = x^4 + y^4$, the gradient of $f$ is $\operatorname{grad} f = (4x^3, 4y^3)$. So there is only one point, $(0,0)$, at which this gradient vanishes. Since $0 = f(0,0) \le x^4 + y^4$ for any $x, y \in \mathbb{R}$, the point $(0,0)$ is a "global" minimum point for $f$. It is easy to see that for the function $h(x,y) = x^2 - y^2$, the point $(0,0)$ is a critical point, but it is neither a local minimum nor a local maximum point for $h$, because in any neighborhood of $(0,0)$ the function $h(x,y)$ has positive and negative values (why?). So we need a criterion to distinguish the local extremum points among the critical points. We recall that a quadratic form in $n$ variables $X_1, X_2, \ldots, X_n$ is a homogeneous polynomial function $g(X_1, X_2, \ldots, X_n)$ of degree two in these $n$ independent variables,

$$g(X_1, X_2, \ldots, X_n) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}X_iX_j,$$
where $a_{ij} = a_{ji}$ for all $i, j \in \{1, 2, \ldots, n\}$, i.e. its associated $n \times n$ matrix $(a_{ij})$ is symmetric. Here this matrix is considered with entries in $\mathbb{R}$. We say that the quadratic form $g$ is positive definite if $g(x_1, x_2, \ldots, x_n) \ge 0$ for any real numbers $x_1, x_2, \ldots, x_n$ and it is zero if and only if all of these numbers are zero. For instance,


$$g(X,Y) = X^2 + XY + Y^2$$

is positive definite. Assume the contrary, namely that we could find $(x,y) \ne (0,0)$, say $y \ne 0$, such that

$$g(x,y) = x^2 + xy + y^2 < 0.$$
Let us divide by $y^2$ and put $t = x/y$. We get $t^2 + t + 1 < 0$, which is false because

$$t^2 + t + 1 = (t + 1/2)^2 + 3/4$$
can never be negative (why?). Moreover, if $x^2 + xy + y^2 = 0$ and if $(x,y) \ne (0,0)$, then we obtain $t^2 + t + 1 = 0$ for $t = x/y$ or $t = y/x$. But the equation $Z^2 + Z + 1 = 0$ has no real root!

We say that the quadratic form $g$ is negative definite if

$$g(x_1, x_2, \ldots, x_n) \le 0$$

for any real numbers $x_1, x_2, \ldots, x_n$ and it is zero if and only if all of these numbers are zero. For instance,

$$g(X,Y) = -X^2 - XY - Y^2$$

is negative definite (prove it!). If a quadratic form is negative definite or positive definite, we say that it is definite. If it is neither positive definite nor negative definite, we say that it is nondefinite. For instance, $g(X,Y) = X^2$ is a quadratic form which is nondefinite because, for $x = 0$ and any $y \ne 0$, it is zero! A basic result in the theory of quadratic forms (see any serious course in Linear Algebra!) gives us a criterion which says when a quadratic form is positive definite, negative definite, or nondefinite. The point is to consider the principal minors

$$\Delta_1 = a_{11}, \quad \Delta_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \quad \ldots, \quad \Delta_n = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$

of the matrix $(a_{ij})$.
THEOREM 77. (Sylvester's criterion) A quadratic form
$$g(X_1, X_2, \ldots, X_n) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}X_iX_j$$
is positive definite if and only if
$$\Delta_1 > 0, \quad \Delta_2 > 0, \quad \Delta_3 > 0, \quad \ldots, \quad \Delta_n > 0.$$
It is negative definite if and only if
$$\Delta_1 < 0, \quad \Delta_2 > 0, \quad \Delta_3 < 0, \quad \Delta_4 > 0, \quad \ldots, \quad (-1)^n\Delta_n > 0.$$

If neither of these two conditions is fulfilled, the quadratic form $g$ is nondefinite.


For instance,
$$g(x,y,z) = x^2 + y^2 - z^2$$

is nondefinite because $\Delta_1 = 1 > 0$, $\Delta_2 = 1 > 0$ and $\Delta_3 = -1 < 0$.
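Sylvester's criterion is straightforward to apply by machine. The following is a minimal Python sketch, assuming the quadratic form is given by its symmetric matrix as a list of rows; the helper names `determinant` and `classify` are illustrative, and exact rational arithmetic from the standard `fractions` module is used so that the sign tests are not affected by rounding.

```python
from fractions import Fraction


def determinant(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap changes the sign of the determinant
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def classify(matrix):
    """Classify a symmetric matrix by the signs of its leading principal minors."""
    n = len(matrix)
    minors = [determinant([row[:k] for row in matrix[:k]]) for k in range(1, n + 1)]
    if all(d > 0 for d in minors):
        return "positive definite"
    if all((-1) ** k * d > 0 for k, d in enumerate(minors, start=1)):
        return "negative definite"
    return "nondefinite"


half = Fraction(1, 2)
print(classify([[1, half], [half, 1]]))                # X^2 + XY + Y^2
print(classify([[-1, -half], [-half, -1]]))            # -X^2 - XY - Y^2
print(classify([[1, 0, 0], [0, 1, 0], [0, 0, -1]]))    # x^2 + y^2 - z^2
```

Note that the cross term $XY$ contributes $\frac12$ to each of the two symmetric off-diagonal entries $a_{12}$ and $a_{21}$.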
Now we are ready to prove the criterion announced above for distinguishing the local extremum points among all the critical points.


THEOREM 78. (The Decision Theorem) Let f : A > R be a func- 
tion of class C? (it has continuous partial derivatives of second order 
on A) defined on an open subset A of R". Leta € A be a critical point 
of f and let 


g(1, ho, ..-, hn) = a? f(a)(ha, ho, ..., hn) 


be the second differential of f at the point a. It is in fact the quadratic 
form 


non 6? f 
g(hi, he, sesh hn) = Ss" S° FaDa a nales: 
a4 ge j4OX 5 


i) Assume that $d^2 f(\mathbf{a})$ is not identically zero and that $d^2 f(\mathbf{a})$ is a
negative definite quadratic form. Then $\mathbf{a}$ is a local maximum point for $f$.

ii) Assume that $d^2 f(\mathbf{a})$ is not identically zero and that $d^2 f(\mathbf{a})$ is a
positive definite quadratic form. Then $\mathbf{a}$ is a local minimum point for $f$.

Let $k$ be the first natural number such that $f$ is of class $C^k$ on $A$
and $d^k f(\mathbf{a})$ is not identically zero.
iii) If $k$ is even and if

$$d^k f(\mathbf{a})(h_1, h_2, \dots, h_n) < 0$$

for any $h_1, h_2, \dots, h_n$ not all zero, then $\mathbf{a}$ is a local maximum point for $f$.
iv) If $k$ is even and if

$$d^k f(\mathbf{a})(h_1, h_2, \dots, h_n) > 0$$

for any $h_1, h_2, \dots, h_n$ not all zero, then $\mathbf{a}$ is a local minimum point for $f$.
If $k$ is odd and $d^k f(\mathbf{a}) \neq 0$, then $\mathbf{a}$ is not a local extremum point.

PROOF. Let us denote by $\mathbf{h}$ the variable vector $(h_1, h_2, \dots, h_n)$ and
let us write Taylor's formula (3.3) for $m = 1$. We get:

$$f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = \frac{1}{2}\, d^2 f(\mathbf{c})(\mathbf{h}) \tag{1.1}$$

where $\mathbf{c}$ is a point on the segment $[\mathbf{a}, \mathbf{a} + \mathbf{h}]$ and $\|\mathbf{h}\| < r$, with $r > 0$ a
sufficiently small real number such that $B(\mathbf{a}, r) \subset A$. Here $df(\mathbf{a}) = 0$
because $\mathbf{a}$ is a critical point. Since $d^2 f(\mathbf{x})$ is
continuous as a function of $\mathbf{x}$ ($d^2 f(\mathbf{x})(\mathbf{h}) = \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2 f}{\partial x_i \partial x_j}(\mathbf{x})\, h_i h_j$
and the second order derivatives are continuous by our hypothesis!),
in a possibly smaller ball $B(\mathbf{a}, r')$ with centre at $\mathbf{a}$ and of radius
$r' < r$, the sign of $d^2 f(\mathbf{x})(\mathbf{h})$, $\mathbf{x} \in B(\mathbf{a}, r')$, is the same
as the sign of $d^2 f(\mathbf{a})(\mathbf{h})$ (why?). Hence the sign of the difference
$f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a})$ is the same as the sign of $d^2 f(\mathbf{a})(\mathbf{h})$ for $\|\mathbf{h}\| < r'$.
Now the statements of the theorem become clear. Indeed, let
us consider for instance that the quadratic form $d^2 f(\mathbf{a})$ is negative
definite, i.e. $d^2 f(\mathbf{a})(\mathbf{h}) < 0$ for any $\mathbf{h} \neq 0$. Then $d^2 f(\mathbf{x})(\mathbf{h}) < 0$ for any
$\mathbf{x}$ in a small ball $B(\mathbf{a}, r')$ as above and for any $\mathbf{h} \neq 0$. So, in (1.1), if
we take $\mathbf{h} \neq 0$ such that $\|\mathbf{h}\| < r'$, i.e. $\mathbf{x} = \mathbf{a} + \mathbf{h} \in B(\mathbf{a}, r')$, we get that
$f(\mathbf{x}) < f(\mathbf{a})$ for any $\mathbf{x} \neq \mathbf{a}$ in $B(\mathbf{a}, r')$, i.e. $\mathbf{a}$ is a local maximum point for
$f$. To prove ii) we proceed in the same way (do it!).

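To prove iii) and iv) we use the Taylor formula (all the differentials of $f$
at $\mathbf{a}$ of order less than $k$ vanish):

$$f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = \frac{1}{k!}\, d^k f(\mathbf{c})(\mathbf{h})$$

and the fact that a homogeneous polynomial $P(X_1, X_2, \dots, X_n)$ of odd
degree $k$, not identically zero, can NEVER have a constant sign in a neighborhood of $0$. If $k$
is even and if $d^k f(\mathbf{a})(\mathbf{h}) < 0$ for any nonzero $\mathbf{h}$, there is a whole small
ball $B(\mathbf{a}, \varepsilon)$ on which $d^k f(\mathbf{x})(\mathbf{h}) < 0$ for any nonzero $\mathbf{h}$. So, on such a
ball, $f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) < 0$, i.e. $\mathbf{a}$ is a local maximum point for $f$, etc.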


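Let us apply this theorem to the following problem. Let
$$f(x, y) = x^4 + y^4 - 4xy, \quad f : \mathbb{R}^2 \to \mathbb{R}.$$

Let us find all the local extrema of $f$. First of all we find the critical
points: $\frac{\partial f}{\partial x} = 4x^3 - 4y = 0$ and $\frac{\partial f}{\partial y} = 4y^3 - 4x = 0$ imply $x^9 - x = 0$. So we
find the following critical points: $M_1(0, 0)$, $M_2(1, 1)$ and $M_3(-1, -1)$.

In order to apply Theorem 78 we need to compute the Hessian matrix
of $f$, i.e. the matrix of the quadratic form $d^2 f$, at each of the three
critical points:

$$\begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{pmatrix} = \begin{pmatrix} 12x^2 & -4 \\ -4 & 12y^2 \end{pmatrix}.$$

At $M_1$ the matrix is

$$\begin{pmatrix} 0 & -4 \\ -4 & 0 \end{pmatrix}.$$

Here $\Delta_1 = 0$ and $\Delta_2 = -16 < 0$, so $d^2 f(M_1)(h_1, h_2) = -8 h_1 h_2$ is
nondefinite and takes both signs. Indeed, for small $t \neq 0$ one has
$f(t, t) = 2t^4 - 4t^2 < 0 = f(0, 0) < 2t^4 + 4t^2 = f(t, -t)$, so $M_1$ is not a local
extremum for $f$. At $M_2$ and $M_3$ the Hessian matrix is

$$\begin{pmatrix} 12 & -4 \\ -4 & 12 \end{pmatrix}.$$

So, $\Delta_1 = 12 > 0$ and $\Delta_2 = 144 - 16 = 128 > 0$. Thus, both $M_2$ and
$M_3$ are local minimum points.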
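The classification of the three critical points of $f(x, y) = x^4 + y^4 - 4xy$ can be checked with a few lines of code. This is only an illustrative sketch for functions of two variables; the function names are hypothetical, and the Hessian entries $12x^2$, $-4$, $12y^2$ are typed in by hand rather than derived by the program.

```python
def f(x, y):
    return x**4 + y**4 - 4 * x * y


def hessian(x, y):
    """Hessian matrix of f, computed by hand: [[12x^2, -4], [-4, 12y^2]]."""
    return [[12 * x * x, -4], [-4, 12 * y * y]]


def classify_critical_point(x, y):
    """Second order test in two variables, based on Delta_1 and Delta_2."""
    (a, b), (_, d) = hessian(x, y)
    delta1 = a
    delta2 = a * d - b * b
    if delta2 > 0:
        # The form is definite; its sign is the sign of Delta_1.
        return "local minimum" if delta1 > 0 else "local maximum"
    if delta2 < 0:
        # The form takes both signs, so the point is a saddle.
        return "not an extremum"
    return "undecided by the second differential"


for point in [(0, 0), (1, 1), (-1, -1)]:
    print(point, f(*point), classify_critical_point(*point))
```

In two variables the sign of $\Delta_2$ alone already tells whether the second differential is definite, which is why the sketch branches on it first.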


EXAMPLE 17. (regression line) In the Cartesian $xOy$ plane we consider
$n$ distinct points $M_1(x_1, y_1), M_2(x_2, y_2), \dots, M_n(x_n, y_n)$. We search
for the "closest" line $y = ax + b$ (the regression line) with respect to
this set of points. Here, the "distance" from the set $\{M_i\}$ to the line
$y = ax + b$ is the "square" distance:

$$SD(a, b) = \sum_{i=1}^{n} [y_i - (a x_i + b)]^2. \tag{1.2}$$

The "closest" line $y = ax + b$ is the one for which the nonnegative
function $SD(a, b)$ is minimum. Thus, we must find the local minimum
points of the two variable function $SD(a, b)$. Let us find the critical
points by solving the $2 \times 2$ system:

$$\begin{cases} \dfrac{\partial SD}{\partial a} = 2 \sum_{i=1}^{n} -x_i (y_i - a x_i - b) = 0 \\[2mm] \dfrac{\partial SD}{\partial b} = 2 \sum_{i=1}^{n} -(y_i - a x_i - b) = 0 \end{cases} \tag{1.3}$$

Let us write this system in the canonical way:

$$\begin{cases} \left(\sum x_i^2\right) a + \left(\sum x_i\right) b = \sum x_i y_i \\ \left(\sum x_i\right) a + n b = \sum y_i \end{cases} \tag{1.4}$$

If not all the points $\{M_i\}$ are on the same line (in this last case
the regression line is obviously the line on which these points are!), the
determinant of this system cannot be zero (use the Cauchy-Schwarz
inequality from Linear Algebra, the equality special case!). So we have
a unique solution $(a_0, b_0)$ of this system. Let us prove that this point
realizes a minimum for the square distance function $SD(a, b)$. Indeed,
the Hessian matrix of $SD$ is

$$\begin{pmatrix} 2 \sum x_i^2 & 2 \sum x_i \\ 2 \sum x_i & 2n \end{pmatrix}.$$

In this case, $\Delta_1 = 2 \sum x_i^2 > 0$ (otherwise all the points $M_i$ would be
on the $Oy$-axis) and $\Delta_2 = 4\left[n \sum x_i^2 - \left(\sum x_i\right)^2\right]$. In order to prove that
$\Delta_2$ is greater than zero we consider in $\mathbb{R}^n$ the vectors $\mathbf{1} = (1, 1, \dots, 1)$,
$\mathbf{x} = (x_1, x_2, \dots, x_n)$ and write the Cauchy-Schwarz inequality for them:
$|\langle \mathbf{1}, \mathbf{x} \rangle| \leq \|\mathbf{1}\| \cdot \|\mathbf{x}\|$ or (by squaring) $\left(\sum x_i\right)^2 \leq n \sum x_i^2$. We know that
equality appears if and only if the two vectors are collinear, i.e. if and
only if $x_1 = x_2 = \dots = x_n$. But this last case appears only if the points
$\{M_i\}$ are on a vertical line and we just assumed that $\{M_i\}$ are not
collinear. Hence, $\Delta_2 > 0$ and the point $(a_0, b_0)$ is a local (in fact a
global, why?) minimum for the square distance function $SD$.


The method described above is called the least squares method
(LSM). It can be generalized to other classes of curves or surfaces.

Let us apply the LSM to the set of points $M_1(-1, 1)$, $M_2(0, 0)$,
$M_3(1, 2)$ and $M_4(2, 3)$. To solve the system (1.4) we must compute
$\sum x_i^2 = 6$, $\sum x_i = 2$, $\sum x_i y_i = 7$ and $\sum y_i = 6$. Then the system
becomes:

$$\begin{cases} 6a + 2b = 7 \\ 2a + 4b = 6 \end{cases}$$

We get $a = 4/5$ and $b = 11/10$. Hence, the regression line is $y = \frac{4}{5} x + \frac{11}{10}$.
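The arithmetic of this example can be reproduced exactly. The sketch below solves the normal equations (1.4) by Cramer's rule with exact fractions; the function name `regression_line` is a hypothetical helper, and it assumes the coordinates are integers (or `Fraction` objects).

```python
from fractions import Fraction


def regression_line(points):
    """Return (a, b) of the least squares line y = a*x + b.

    Solves the normal equations
        (sum x^2) a + (sum x) b = sum x*y
        (sum x)   a +   n     b = sum y
    by Cramer's rule. Coordinates are assumed to be integers or Fractions.
    """
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    det = n * sxx - sx * sx
    if det == 0:
        # All x_i are equal: the points lie on a vertical line.
        raise ValueError("the points lie on a vertical line")
    a = Fraction(n * sxy - sx * sy, det)
    b = Fraction(sxx * sy - sx * sxy, det)
    return a, b


a, b = regression_line([(-1, 1), (0, 0), (1, 2), (2, 3)])
print(a, b)  # 4/5 11/10
```

The determinant tested in the code is exactly $n \sum x_i^2 - (\sum x_i)^2$, the quantity shown to be positive through the Cauchy-Schwarz inequality.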


2. Problems 


1. Find the local extrema for:
a) $f(x, y, z) = x^2 + y^2 + z^2 - xy + x - 2z$;
b) $f(x, y) = xy(6 - x - y)$, $x > 0$, $y > 0$;
c) $f(x, y) = (x - 2)^2 + (y + 7)^2$ (try directly, without the above algorithm!);
d) $f(x, y) = xy(2 - x - y)$;
e) $f(x, y) = \ln(1 - x^2 - y^2)$;
f) $f(x, y) = x^3 + y^3 - 3xy$;
g) $f(x, y) = x^4 + y^4 - 2x^2 + 4xy - 2y^2$;
h) $f(x, y, z) = xyz(4a - x - y - z)$, where $a$, $x$, $y$ and $z$ are not zero.
2. Find $\alpha, \beta, \gamma$ such that
$$f(x, y) = 2x^2 + 2y^2 - 8xy + \alpha x + \beta y + \gamma$$
has a minimum equal to zero at $A(2, -1)$.
3. A price function is of the form
$$f(x, y) = x^2 + xy + y^2 - 3ax - 3by,$$
where $a$, $b$ are constant numbers. Find $a$ and $b$ such that the minimum
of $f$ is as large as possible.
4. Study the local extrema for $f(x, y) = x^4 + y^4 - 2$.


CHAPTER 11 


Implicitly defined functions 


1. Local Inversion Theorem 


Let $\mathbf{a}$ be a point in $\mathbb{R}^n$. By an (open) neighborhood $A$ of $\mathbf{a}$ we mean
any open subset $A$ of $\mathbb{R}^n$ which contains the point $\mathbf{a}$. So, if $A$ is a
neighborhood of $\mathbf{a}$, then there is an open ball $B(\mathbf{a}, r)$, centered at $\mathbf{a}$
and of radius $r > 0$, which is contained in $A$.


DEFINITION 33. Let $A$ and $B$ be two open subsets of $\mathbb{R}^n$. A vector
function $\mathbf{f} : A \to B$ is said to be a diffeomorphism between $A$ and $B$ if:
i) $\mathbf{f}$ is a bijection; ii) $\mathbf{f}$ is of class $C^1$ on $A$ and iii) $\mathbf{f}^{-1} : B \to A$ is of
class $C^1$ on $B$.

For instance, $f_a : \mathbb{R} \to \mathbb{R}$, $f_a(x) = x + a$ is a diffeomorphism because
its inverse $g(x) = x - a$ is of class $C^1$ on $\mathbb{R}$. But the mapping $f : \mathbb{R} \to \mathbb{R}$,
$f(x) = x^3$ is not a diffeomorphism because its inverse $g(x) = \sqrt[3]{x}$ is not
differentiable at $x = 0$ (why?).


REMARK 31. It is easy to see that the composition between two 
diffeomorphisms is also a diffeomorphism (prove it!). 


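THEOREM 79. Let $\mathbf{f} : A \to B$ be a diffeomorphism and let $\mathbf{a}$ be a
point in $A$. Then the linear mapping $d\mathbf{f}(\mathbf{a}) : \mathbb{R}^n \to \mathbb{R}^n$ is an isomorphism
of real vector spaces. In particular, the Jacobi matrix $J_{\mathbf{a},\mathbf{f}}$ of $\mathbf{f}$ at
$\mathbf{a}$ is invertible and its determinant has a constant sign in a neighborhood
of $\mathbf{a}$. This means that there is an open ball $B(\mathbf{a}, r)$, $r > 0$, contained in
$A$, such that $\det J_{\mathbf{x},\mathbf{f}} > 0$ (or $\det J_{\mathbf{x},\mathbf{f}} < 0$) for any $\mathbf{x} \in B(\mathbf{a}, r)$. In fact,
the sign of $\det J_{\mathbf{x},\mathbf{f}}$ is the same as the sign of $\det J_{\mathbf{a},\mathbf{f}}$ for any $\mathbf{x}$ in
$B(\mathbf{a}, r)$.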


PROOF. Let $\mathbf{g} : B \to A$ be the inverse of $\mathbf{f}$ and let $\mathbf{b} = \mathbf{f}(\mathbf{a})$. Then
$\mathbf{g} \circ \mathbf{f} = 1_A$, the identity mapping defined on $A$. Now, Theorem 69 says
that $J_{\mathbf{b},\mathbf{g}} \cdot J_{\mathbf{a},\mathbf{f}} = I_{n \times n}$, the $n \times n$ identity matrix. Hence, the Jacobi
matrix $J_{\mathbf{a},\mathbf{f}}$ is invertible, i.e. $d\mathbf{f}(\mathbf{a})$ is an isomorphism of real vector spaces
(see the connections between the linear mappings and their corresponding
matrices, w.r.t. a fixed basis in $\mathbb{R}^n$). Moreover, $\det J_{\mathbf{a},\mathbf{f}}$ cannot be
zero (why?), say positive, for instance. Since $\mathbf{f}$ is a function of class $C^1$
on $A$, all the partial derivatives which appear as entries in the matrix
$J_{\mathbf{x},\mathbf{f}}$ are continuous. Thus, the mapping $\mathbf{x} \mapsto \det J_{\mathbf{x},\mathbf{f}}$ (denoted here
by $T$) is a continuous mapping on $A$, particularly at $\mathbf{a}$. Since $T(\mathbf{a}) > 0$,
we claim that there is at least one small positive real number $r > 0$
such that for any $\mathbf{x}$ in $B(\mathbf{a}, r)$ we have $T(\mathbf{x}) > 0$. Indeed, otherwise, we
could construct a sequence $\{\mathbf{x}^m\}$ of elements in $A$ which is convergent
to $\mathbf{a}$ and for which $T(\mathbf{x}^m) \leq 0$, $m = 1, 2, \dots$. The continuity of $T$ would
imply that $T(\mathbf{a}) \leq 0$, a contradiction! Hence, there is such a small ball
$B(\mathbf{a}, r)$, $r > 0$, on which $T(\mathbf{x})$ is positive and the proof is complete.


Thus, locally, around a fixed point $\mathbf{a}$, the differential $d\mathbf{f}(\mathbf{x})$ is
invertible. We know that the increment $\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{a})$ of the function $\mathbf{f}$ at
$\mathbf{a}$ can be well approximated by $d\mathbf{f}(\mathbf{a})(\mathbf{x} - \mathbf{a})$ (see Taylor's formula for
many variables). A natural question arises: "Is $\mathbf{f}$ itself invertible in a
neighborhood of $\mathbf{a}$?" If the function $\mathbf{f}$ describes a physical phenomenon,
this means that the phenomenon is reversible as long as we stay
close enough to the point $\mathbf{a}$, and this is very important to
know in engineering practice. The following result is fundamental
in all pure and applied mathematics. It is a converse of
the above theorem.


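THEOREM 80. (Local Inversion Theorem) Let $A$ be an open subset
of $\mathbb{R}^n$ and let $\mathbf{f} : A \to \mathbb{R}^n$ be a function of class $C^1$ on $A$. Let $\mathbf{a}$ be a
point in $A$ such that $\det J_{\mathbf{a},\mathbf{f}} \neq 0$. Then there is a neighborhood $U$ of $\mathbf{a}$,
$U \subset A$, such that the restriction of $\mathbf{f}$ to $U$, $\mathbf{f}|_U : U \to V = \mathbf{f}(U)$, is
a diffeomorphism. In particular, $\det J_{\mathbf{x},\mathbf{f}} \neq 0$ on $U$ and, if $\mathbf{g} : V \to U$
is the local inverse of $\mathbf{f}$ ($\mathbf{g} = (\mathbf{f}|_U)^{-1}$), then

$$\det J_{\mathbf{f}(\mathbf{x}),\mathbf{g}} = \frac{1}{\det J_{\mathbf{x},\mathbf{f}}} \quad \text{and} \quad J_{\mathbf{f}(\mathbf{x}),\mathbf{g}} = (J_{\mathbf{x},\mathbf{f}})^{-1}.$$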


PROOF. (only for $n = 1$; see a complete proof in Section 7 of this
chapter) In this restricted case we use the usual scalar notation: $f$ for the
function and $a \in A \subset \mathbb{R}$ for the point. Now $\det J_{a,f} = f'(a)$ (why?) and the hypothesis says
that $f'(a)$ is not zero, say that $f'(a) > 0$. Since $f'$ is continuous ($f$ is of
class $C^1$ on $A$), as in the proof of the above theorem, we can conclude
that there is an open ball $U = B(a, r) = (a - r, a + r)$, $r > 0$, on which
$f'$ is positive, i.e. $f'(x) > 0$ for any $x$ in $U$. This means that on this $U$
our function $f$ is strictly increasing. So, the restriction of $f$ to $U$ has an
inverse $g : V = f(U) \to U$. Since $f$ is continuous and strictly increasing,
one can easily prove that $f^{-1} = g$ is continuous on $V$ (prove it! or find
by yourself a previous result from which this statement immediately
follows!). We now prove that this function $g(y) = x$, where $y = f(x)$,
is differentiable on $V$. Indeed, let $b = f(a)$ be a point in $V$ and let
$\{y_n = f(x_n)\}$, $y_n \neq b$, be a sequence convergent to $b$. Then $\{x_n = g(y_n)\}$ tends
to $a$ (because of the continuity of $g$) and

$$\lim_{y_n \to b} \frac{g(y_n) - g(b)}{y_n - b} = \lim_{x_n \to a} \frac{x_n - a}{f(x_n) - f(a)} = \frac{1}{f'(a)}.$$

Thus, $g$ is differentiable at $b$ and $g'(b) = \frac{1}{f'(a)}$.
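The formula $g'(b) = 1/f'(a)$ can be observed numerically. The sketch below uses a hypothetical strictly increasing function, $f(x) = x^3 + x$ (chosen only for illustration, it does not appear in the text), inverts it by bisection and compares a difference quotient of the inverse with $1/f'(a)$ at $a = 1$, $b = f(1) = 2$.

```python
def f(x):
    # Hypothetical example: f'(x) = 3x^2 + 1 > 0, so f is strictly increasing.
    return x**3 + x


def f_prime(x):
    return 3 * x**2 + 1


def inverse(y, lo=-10.0, hi=10.0, tol=1e-13):
    """Solve f(x) = y by bisection on [lo, hi] (f is increasing there)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


a = 1.0
b = f(a)  # b = 2
h = 1e-5
# Symmetric difference quotient of the inverse g at b.
slope_of_inverse = (inverse(b + h) - inverse(b - h)) / (2 * h)
print(slope_of_inverse, 1 / f_prime(a))
```

Both printed numbers should be close to $1/4$, since $f'(1) = 4$.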

EXAMPLE 18. (Polar coordinates) Let $M(x, y)$ be a point in the
Cartesian plane $\{O, \mathbf{i}, \mathbf{j}\}$ and let $\rho = \sqrt{x^2 + y^2}$ be the distance from
$M$ to the origin $O$. Let $\theta$ be the unique angle in $[0, 2\pi)$ such that
$x = \rho \cos\theta$ and $y = \rho \sin\theta$ (prove that such an angle exists and that
it is unique! See Fig. 10.1). Let us consider $A = (0, \infty) \times (0, 2\pi) \subset \mathbb{R}^2$
and $B = \mathbb{R}^2 \setminus \{[0, \infty) \times \{0\}\}$ in the same $\mathbb{R}^2$. Let $\mathbf{f} : A \to B$, $\mathbf{f}(\rho, \theta) =
(\rho \cos\theta, \rho \sin\theta)$. It is easy to see that $\det J_{(\rho,\theta),\mathbf{f}} = \rho \neq 0$. It is easy
to prove that this $\mathbf{f}$ is a diffeomorphism. The analytical expression of
its inverse $\mathbf{f}^{-1}$ is not so simple (why? Find it!). The new "coordinates"
$(\rho, \theta)$ are called the polar coordinates of $M$. For instance, the Cartesian
equation of the circle $x^2 + y^2 = R^2$ may be simply written in polar
coordinates as $\rho = R$!


Fig. 10.1 
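As a small numerical companion to Example 18, the sketch below evaluates the polar map, its Jacobian determinant and one possible expression of the inverse. Using `math.atan2` reduced modulo $2\pi$ for the angle is an assumption of this sketch, one convenient way to write the inverse, not necessarily the expression the exercise has in mind.

```python
import math


def polar_to_cartesian(rho, theta):
    """The map f(rho, theta) = (rho cos(theta), rho sin(theta))."""
    return rho * math.cos(theta), rho * math.sin(theta)


def jacobian_det(rho, theta):
    """Determinant of [[cos, -rho sin], [sin, rho cos]]; it simplifies to rho."""
    c, s = math.cos(theta), math.sin(theta)
    return c * (rho * c) - (-rho * s) * s


def cartesian_to_polar(x, y):
    """One way to write the inverse: atan2 shifted into [0, 2*pi)."""
    rho = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)
    return rho, theta


x, y = polar_to_cartesian(2.0, 4.0)
print(cartesian_to_polar(x, y))   # close to (2.0, 4.0)
print(jacobian_det(2.0, 4.0))     # close to 2.0
```

The case distinction hidden inside `atan2` (the quadrant of $(x, y)$) is what makes the analytical expression of the inverse less simple than that of $\mathbf{f}$.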


DEFINITION 34. (regular transformations) Let $A$ be an open subset
of $\mathbb{R}^n$ and let $\mathbf{f} : A \to \mathbb{R}^n$ be a mapping defined on $A$ with values in $\mathbb{R}^n$.
We say that $\mathbf{f}$ is a regular transformation at the point $\mathbf{a}$ of $A$ if there
is a neighborhood $U$ of $\mathbf{a}$, $U \subset A$, such that the restriction of $\mathbf{f}$ to $U$
gives rise to a diffeomorphism $\mathbf{f}|_U : U \to V = \mathbf{f}(U)$. If $\mathbf{f}$ is regular at
any point of $A$, we say that $\mathbf{f}$ is a regular transformation on $A$ or that
$\mathbf{f}$ is a local diffeomorphism on $A$.


In particular, for a local diffeomorphism $\mathbf{f}$, one has that $\det J_{\mathbf{x},\mathbf{f}} \neq 0$
for any $\mathbf{x}$ in $A$ and, if in addition $A$ is connected, then $\det J_{\mathbf{x},\mathbf{f}}$ has a constant sign
on $A$ (why?). For instance, the polar coordinates transformation (see
Example 18) is a regular transformation (prove it!). The composition
of two regular transformations is again a regular transformation.
Such transformations are "good" for engineers. They are locally
sufficiently "smooth". This means that they do not produce "breaking" or
"noncontinuous (broken) velocities", or "corners".


REMARK 32. The local inversion theorem applied to the regular
transformations gives rise to some of their basic properties.
For instance, a regular transformation $\mathbf{f} : \mathbb{R}^n \to \mathbb{R}^n$ carries an open
subset $A$ of $\mathbb{R}^n$ into the open subset $\mathbf{f}(A)$ (why?). If $A$ is a domain,
i.e. if $A$ is an open and connected subset of $\mathbb{R}^n$, then $\mathbf{f}(A)$ is also
a domain of $\mathbb{R}^n$ (why?). Moreover, the Jacobian $\det J_{\mathbf{x},\mathbf{f}}$ has the same
sign on $A$, if $A$ is a domain (try to prove it!).


2. Implicit functions 


What is the difference between the curves: 1) $C_1 = \{(x, y) \in \mathbb{R}^2 :
y = \sqrt{1 - x^2}\}$ and 2) $C_2 = \{(x, y) : x^2 + y^2 = 1,\ y \geq 0\}$? They represent
the same object, the half of the circle of radius 1, with centre at $O$,
which is above the $Ox$-axis, but... the representations are distinct. In
the first case we have an "explicit" representation, i.e. we can write
$y = f(x)$; this means that we can write one variable as a known function
of the other one. In the second case we have to compute $y$ as a function
of $x$ from the "implicit" relation $x^2 + y^2 = 1$. In our case this can be
done, but in other cases such an explicit computation cannot be done.
For instance, it is very difficult to express $y$ as a function of $x$ if

$$x^3 + 2y^3 - 3xy = 0. \tag{$*$}$$

But, if we knew that such an expression $y = f(x)$ exists (theoretically)
in a neighborhood of a point on the curve, say $(1, 1)$, we can compute
the "velocity" $f'(1)$, the "acceleration" $f''(1)$, $f'''(1)$, etc. Practically,
we proceed as follows. Let us write again the implicit relation ($*$) with
$f(x)$ instead of $y$:

$$x^3 + 2f(x)^3 - 3x f(x) = 0$$

and let us differentiate it with respect to $x$:

$$3x^2 + 6f(x)^2 f'(x) - 3f(x) - 3x f'(x) = 0. \tag{$**$}$$


We see that, whatever the implicit relation is, the first
derivative $f'(x)$ always appears to the power 1, i.e. it can be "linearly" computed
from ($**$):

$$f'(x) = \frac{f(x) - x^2}{2f(x)^2 - x}. \tag{2.1}$$

If one puts $x = 1$ in (2.1) one obtains $f'(1) = 0$. If we differentiate
again formula (2.1) with respect to $x$, we get

$$f''(x) = \frac{-2f(x)^2 f'(x) - 4x f(x)^2 - x f'(x) + 4x^2 f(x) f'(x) + f(x) + x^2}{[2f(x)^2 - x]^2}.$$

If here we substitute $f'(x)$ with its expression from (2.1), we get the
expression of $f''(x)$ only as an explicit function of $x$ and of $f(x)$. Let
us put now $x = 1$ and we obtain $f''(1)$, etc.

In our above discussion we supposed that our equation can be
uniquely solved with respect to $y$. But this is not always true. For
instance, if $x^2 + y^2 = 1$, then $y(x) = \pm\sqrt{1 - x^2}$, so that in any
neighborhood of $(1, 0)$ we cannot find a UNIQUE function $y = y(x)$ such
that $x^2 + y(x)^2 = 1$. Hence, we cannot compute $y'(1)$, $y''(1)$, etc. This
is why we need a mathematical result to say precisely when we have
such a unique "implicit" function and when we do not.


THEOREM 81. ((1-1) Implicit Function Theorem) Let $A$ be an
open subset of $\mathbb{R}^2$ and let $F : A \to \mathbb{R}$ be a function of two variables
which verifies the following properties at a fixed point $(a, b)$ of $A$:

i) $F$ is a function of class $C^1$ on $A$.

ii) $F(a, b) = 0$, i.e. $(a, b)$ is a solution of the equation $F(x, y) = 0$.

iii) $\frac{\partial F}{\partial y}(a, b) \neq 0$.

Then there is a neighborhood $U$ of $a$, a neighborhood $V$ of $b$ with
$U \times V \subset A$ and a unique function $f : U \to V$ such that:

1) $F(x, f(x)) = 0$ for all $x$ in $U$.

2) $f(a) = b$.

3) $f$ is of class $C^1$ on $U$ and

$$f'(x) = -\frac{\frac{\partial F}{\partial x}(x, f(x))}{\frac{\partial F}{\partial y}(x, f(x))}$$

for all $x$ in $U$.
PROOF. We construct an auxiliary function
$$\Phi = (\varphi_1, \varphi_2) : A \to \mathbb{R}^2, \quad \Phi(x, y) = (x, F(x, y))$$
for all $(x, y)$ in $A$. Thus, $\varphi_1(x, y) = x$ and $\varphi_2(x, y) = F(x, y)$. We are going to
apply the Local Inversion Theorem to this function $\Phi$. Let us compute
the Jacobi matrix of $\Phi$ at $(a, b)$:

$$J_{(a,b),\Phi} = \begin{pmatrix} 1 & 0 \\ \frac{\partial F}{\partial x}(a, b) & \frac{\partial F}{\partial y}(a, b) \end{pmatrix}.$$

Since $\Phi(a, b) = (a, 0)$ and since $\det J_{(a,b),\Phi} = \frac{\partial F}{\partial y}(a, b) \neq 0$, the Local
Inversion Theorem 80 says that there is an open neighborhood $U \times V$ of
$(a, b)$ and an open neighborhood $U \times W$ of $(a, 0)$ (why can we take the
same $U$?) such that the restriction $\Phi|_{U \times V} : U \times V \to U \times W$ of $\Phi$ to
$U \times V$ is a diffeomorphism. Let $\Psi = (\psi_1, \psi_2) : U \times W \to U \times V$ be the
inverse of this diffeomorphism. Let us define $f(x) = \psi_2(x, 0)$ for any $x$
in $U$. It is clear that $f : U \to V$ is of class $C^1$ on $U$, $f(a) = b$ and for
any $x$ of $U$ we have

$$(x, 0) = \Phi[\Psi(x, 0)] = \Phi[\psi_1(x, 0), \psi_2(x, 0)] = \Phi[x, f(x)] = (x, F(x, f(x))),$$

i.e. $F(x, f(x)) = 0$ for any $x$ in $U$. The function $f : U \to V$ is of
class $C^1$ on $U$ because $\psi_2(X, Y)$ has a continuous partial derivative with
respect to $X$ at any point of the form $(x, 0)$ for any $x$ in $U$. Let us
differentiate totally with respect to $x$ (this means that $x$ is considered
not only as "the first" free variable of $F(x, y)$, but also as an
implicit hidden variable in $y = f(x)$) the relation $F(x, f(x)) = 0$:

$$0 = \frac{\partial F}{\partial x}(x, f(x)) + \frac{\partial F}{\partial y}(x, f(x)) \cdot f'(x);$$

thus

$$f'(x) = -\frac{\frac{\partial F}{\partial x}(x, f(x))}{\frac{\partial F}{\partial y}(x, f(x))}$$

for any $x$ in $U$. Since $\det J_{(x,y),\Phi} \neq 0$ on $U \times V$ (why?) we get from

$$J_{(x,y),\Phi} = \begin{pmatrix} 1 & 0 \\ \frac{\partial F}{\partial x}(x, y) & \frac{\partial F}{\partial y}(x, y) \end{pmatrix}$$

that $\frac{\partial F}{\partial y}(x, f(x)) \neq 0$ for any $x$ in $U$.

If $g$ were another function defined on an open neighborhood $U_1$ of $a$,
which verifies the conditions 1), 2) and 3) then, on the neighborhood
$U_2 = U \cap U_1$ we would have

$$\psi_2(x, F(x, g(x))) = g(x)$$

for any $x$ in $U_2$, or $\psi_2(x, 0) = g(x) = f(x)$ for any $x$ in $U_2$. Hence, the
uniqueness refers to another smaller neighborhood of $a$ on which $f$
and $g$ are equal. In some conditions, this uniqueness can be extended
to the whole initial $U$ or even to the whole $\mathrm{pr}_1(A)$, the projection of $A$
on the $Ox$-axis.

Let us consider again the implicit equation
$$x^3 + 2y^3 - 3xy = 0$$
and let us study it around the solution $(1, 1)$. Since $\frac{\partial F}{\partial y}(1, 1) = 3 \neq 0$,
the (1-1) Implicit Function Theorem says that there is a neighborhood
$U$ of $x = 1$, a neighborhood $V$ of $y = 1$ and a function $f : U \to V$,
of class $C^1$ on $U$, such that the points $\{(x, f(x)) : x \in U\}$ are on the
plane curve $x^3 + 2y^3 - 3xy = 0$, i.e. $x^3 + 2f(x)^3 - 3x f(x) = 0$ for
any $x$ in $U$. Now, if we are sure of the existence of such an $f$, we can
use different approximation methods to compute it (approximately!).
The worst situation is when the conditions of the Implicit Function
Theorem fail and we try to compute $y = f(x)$ approximately! Usually,
in this last case one has more than one function $y = f(x)$ which
verifies our equation and during our approximation process we "jump" from
one "branch" to another one, the obtained values for "$f(x)$" having
a chaotic behavior. For instance, around the point $(1, 0)$, the implicit
solution of the equation $x^2 + y^2 = 1$ with respect to $y$ has two branches:
$y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$. This is because $\frac{\partial F}{\partial y}(1, 0) = 0$ and the
Implicit Function Theorem fails around the point $(1, 0)$.
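One possible "approximation method" for the branch through $(1, 1)$ of $x^3 + 2y^3 - 3xy = 0$ is Newton's iteration in the variable $y$. The sketch below is only an illustration under that assumption (the choice of Newton's method, the starting value and the step sizes are not from the text); it also evaluates formula (2.1) and estimates $f''(1)$ by a second difference.

```python
def F(x, y):
    return x**3 + 2 * y**3 - 3 * x * y


def dF_dy(x, y):
    # Partial derivative of F with respect to y; it equals 3 at (1, 1).
    return 6 * y**2 - 3 * x


def implicit_f(x, y_start=1.0, steps=50):
    """Newton iteration in y for F(x, y) = 0, started at the branch value y = 1."""
    y = y_start
    for _ in range(steps):
        y -= F(x, y) / dF_dy(x, y)
    return y


def f_prime(x):
    """Formula (2.1): f'(x) = (f(x) - x^2) / (2 f(x)^2 - x)."""
    y = implicit_f(x)
    return (y - x**2) / (2 * y**2 - x)


h = 1e-3
# Second order difference quotient approximating f''(1).
second = (implicit_f(1 + h) - 2 * implicit_f(1) + implicit_f(1 - h)) / h**2

print(implicit_f(1.0), f_prime(1.0), second)
```

The second difference should come out close to $-2$, the value obtained by putting $x = 1$, $f(1) = 1$, $f'(1) = 0$ into the formula for $f''(x)$. The iteration divides by $\frac{\partial F}{\partial y}$, so it is exactly at points where this derivative vanishes, such as $(1, 0)$ on the circle, that such a scheme breaks down.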

There are two directions for generalizations of this basic theorem.
One refers to increasing the number of variables and the other to considering
vector field relations, i.e. a system of implicit equations. We do not
prove these generalizations because these proofs do not contain new
ideas and the "many" variables notation is too cumbersome.


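THEOREM 82. ((n-1) Implicit Function Theorem) Let $A$ be an
open subset of $\mathbb{R}^{n+1}$, let $(\mathbf{a}, b) = (a_1, a_2, \dots, a_n, b)$ be a point of $A$ and let
$F : A \to \mathbb{R}$, $F(x_1, x_2, \dots, x_n, y)$, be a function of $n + 1$ variables which
verifies the following conditions:

i) $F$ is of class $C^1$ on $A$, i.e. it has continuous partial derivatives
with respect to each of its $n + 1$ variables.

ii) $F(\mathbf{a}, b) = 0$.

iii) $\frac{\partial F}{\partial y}(\mathbf{a}, b) \neq 0$.

Then there is a neighborhood $U$ of $\mathbf{a}$, a neighborhood $V$ of $b$ such
that $U \times V \subset A$ and a unique function $f : U \to V$ such that:

1) $F[\mathbf{x}, f(\mathbf{x})] = 0$ for all $\mathbf{x}$ in $U$.

2) $f(\mathbf{a}) = b$.

3) $f$ is of class $C^1$ on $U$ and
$$\frac{\partial f}{\partial x_j}(\mathbf{x}) = -\frac{\frac{\partial F}{\partial x_j}(\mathbf{x}, f(\mathbf{x}))}{\frac{\partial F}{\partial y}(\mathbf{x}, f(\mathbf{x}))}, \quad j = 1, 2, \dots, n,$$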


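for any $\mathbf{x}$ in $U$.

For a proof see [FS]. Let us take the following equation:
$$2x^3 + y^3 + 2z^3 - 5xyz = 0$$
and its solution $M(1, 1, 1)$ (prove this!). Since $\frac{\partial F}{\partial z}(1, 1, 1) = 1 \neq 0$,
one can apply the last theorem and can write $z = z(x, y)$ around the
point $(1, 1)$. Let us compute $\frac{\partial^2 z}{\partial x \partial y}(1, 1)$. The most practical way is to
put $z = z(x, y)$ into our equation:

$$2x^3 + y^3 + 2z(x, y)^3 - 5xy\, z(x, y) = 0$$

and to differentiate this with respect to $x$ and to $y$:

$$6x^2 + 6z(x, y)^2 \frac{\partial z}{\partial x}(x, y) - 5y\, z(x, y) - 5xy \frac{\partial z}{\partial x}(x, y) = 0,$$

$$3y^2 + 6z(x, y)^2 \frac{\partial z}{\partial y}(x, y) - 5x\, z(x, y) - 5xy \frac{\partial z}{\partial y}(x, y) = 0.$$

From these equations we compute

$$\frac{\partial z}{\partial x}(x, y) = \frac{6x^2 - 5yz}{5xy - 6z^2}, \quad \frac{\partial z}{\partial y}(x, y) = \frac{3y^2 - 5xz}{5xy - 6z^2}. \tag{2.2}$$

Now,

$$\frac{\partial^2 z}{\partial x \partial y} = \frac{\partial}{\partial x}\left(\frac{3y^2 - 5x\, z(x, y)}{5xy - 6z(x, y)^2}\right) = \frac{\left(-5z - 5x \frac{\partial z}{\partial x}\right)(5xy - 6z^2) - (3y^2 - 5xz)\left(5y - 12z \frac{\partial z}{\partial x}\right)}{(5xy - 6z^2)^2}. \tag{2.3}$$

We need to compute $\frac{\partial z}{\partial x}(1, 1)$, so we must use formula (2.2) and find
$\frac{\partial z}{\partial x}(1, 1) = -1$ (because $z(1, 1) = 1$). Come back to formula (2.3) and
find $\frac{\partial^2 z}{\partial x \partial y}(1, 1) = 34$.

We consider now many relations, i.e. instead of the scalar function
$F$ we take a vector function $\mathbf{F} = (F_1, F_2, \dots, F_m) : A \to \mathbb{R}^m$, where $A$ is
an open subset in $\mathbb{R}^{n+m}$.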
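Going back for a moment to the scalar example $2x^3 + y^3 + 2z^3 - 5xyz = 0$ at $M(1, 1, 1)$, the value of the mixed derivative can be double-checked in two independent ways. The sketch below first evaluates formulas (2.2) and (2.3) with exact fractions, then approximates $z(x, y)$ by Newton's iteration and takes a mixed difference quotient; the Newton solver, the step size and all function names are assumptions of this sketch.

```python
from fractions import Fraction


def F(x, y, z):
    return 2 * x**3 + y**3 + 2 * z**3 - 5 * x * y * z


def z_x(x, y, z):
    """Formula (2.2) for dz/dx (integer or Fraction arguments)."""
    return Fraction(6 * x**2 - 5 * y * z, 5 * x * y - 6 * z**2)


def z_xy(x, y, z):
    """Formula (2.3) for the mixed second derivative."""
    zx = z_x(x, y, z)
    den = Fraction(5 * x * y - 6 * z**2)
    num = (-5 * z - 5 * x * zx) * den - (3 * y**2 - 5 * x * z) * (5 * y - 12 * z * zx)
    return num / den**2


def solve_z(x, y, z_start=1.0, steps=50):
    """Newton iteration in z for F(x, y, z) = 0 near the solution z = 1."""
    z = z_start
    for _ in range(steps):
        z -= F(x, y, z) / (6 * z**2 - 5 * x * y)
    return z


h = 1e-4
# Mixed central difference quotient approximating d^2 z / dx dy at (1, 1).
mixed = (solve_z(1 + h, 1 + h) - solve_z(1 + h, 1 - h)
         - solve_z(1 - h, 1 + h) + solve_z(1 - h, 1 - h)) / (4 * h * h)

print(z_x(1, 1, 1), z_xy(1, 1, 1), mixed)
```

The exact evaluation reproduces $-1$ and $34$, and the finite difference value should agree with $34$ up to a small discretisation error.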


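THEOREM 83. Let $A$ be an open subset of $\mathbb{R}^{n+m}$ and let
$$(\mathbf{a}, \mathbf{b}) = (a_1, a_2, \dots, a_n; b_1, b_2, \dots, b_m)$$
be a point in $A$. Let $\mathbf{F} = (F_1, F_2, \dots, F_m) : A \to \mathbb{R}^m$ be a function which
verifies the following conditions:
i) $\mathbf{F}$ is a function of class $C^1$ on $A$.
ii) $\mathbf{F}(\mathbf{a}, \mathbf{b}) = \mathbf{0}$, i.e.
$$\begin{cases} F_1(a_1, a_2, \dots, a_n; b_1, b_2, \dots, b_m) = 0 \\ \quad \vdots \\ F_m(a_1, a_2, \dots, a_n; b_1, b_2, \dots, b_m) = 0 \end{cases}$$
iii) For $\mathbf{F}(\mathbf{x}, \mathbf{y}) = \mathbf{F}(x_1, x_2, \dots, x_n; y_1, y_2, \dots, y_m)$, we define the Jacobian
matrix relative to $\mathbf{y} = (y_1, y_2, \dots, y_m)$ only, as follows:

$$J_{\mathbf{y},\mathbf{F}}(\mathbf{x}, \mathbf{y}) = \begin{pmatrix} \frac{\partial F_1}{\partial y_1}(\mathbf{x}, \mathbf{y}) & \cdots & \frac{\partial F_1}{\partial y_m}(\mathbf{x}, \mathbf{y}) \\ \vdots & & \vdots \\ \frac{\partial F_m}{\partial y_1}(\mathbf{x}, \mathbf{y}) & \cdots & \frac{\partial F_m}{\partial y_m}(\mathbf{x}, \mathbf{y}) \end{pmatrix}.$$

The condition is that $\det J_{\mathbf{y},\mathbf{F}}(\mathbf{a}, \mathbf{b}) \neq 0$. This last determinant can be
suggestively denoted by
$$\det J_{\mathbf{y},\mathbf{F}}(\mathbf{a}, \mathbf{b}) = \frac{D(F_1, F_2, \dots, F_m)}{D(y_1, y_2, \dots, y_m)}(\mathbf{a}, \mathbf{b}).$$
Then there is a neighborhood $U = U_1 \times U_2 \times \dots \times U_n$ of $\mathbf{a} =
(a_1, a_2, \dots, a_n)$, a neighborhood $V = V_1 \times V_2 \times \dots \times V_m$ of $\mathbf{b} = (b_1, b_2, \dots, b_m)$,
such that $U \times V \subset A$ and a unique function $\mathbf{f} = (f_1, f_2, \dots, f_m)$,
$f_i : U \to V_i$, $i = 1, 2, \dots, m$, with the following properties:
1) $\mathbf{F}(\mathbf{x}, \mathbf{f}(\mathbf{x})) = \mathbf{0}$ for any $\mathbf{x}$ in $U$.
2) $\mathbf{f}(\mathbf{a}) = \mathbf{b}$.
3) $\mathbf{f}$ is of class $C^1$ on $U$ and

$$\frac{\partial f_i}{\partial x_j}(\mathbf{x}) = -\frac{\dfrac{D(F_1, F_2, \dots, F_m)}{D(y_1, \dots, y_{i-1}, x_j, y_{i+1}, \dots, y_m)}(\mathbf{x}, \mathbf{f}(\mathbf{x}))}{\dfrac{D(F_1, F_2, \dots, F_m)}{D(y_1, y_2, \dots, y_m)}(\mathbf{x}, \mathbf{f}(\mathbf{x}))}. \tag{2.4}$$

It is not necessary to memorize this last cumbersome formula, as
we can see in the following example.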

Let $(C) : x^2 + y^2 - z^2 = 0$ be a conic surface and let $(E) : x^2 +
2y^2 + 3z^2 - 4 = 0$ be an ellipsoid. Let $\gamma = (C) \cap (E)$ be their intersection
curve. We see that the point $M(1, 0, 1)$ is on this curve. The
question is if we can find a parametrization of the form

$$\gamma : \begin{cases} x = x(y) \\ y = y \\ z = z(y) \end{cases}$$

i.e. if we can use $y$ as a parameter for this curve in a neighborhood
of $M$. This is equivalent to seeing if the following system for the implicit
functions $x = x(y)$ and $z = z(y)$ can be solved around $M$:

$$\begin{cases} F_1(y; x, z) = x^2 + y^2 - z^2 = 0, \\ F_2(y; x, z) = x^2 + 2y^2 + 3z^2 - 4 = 0. \end{cases} \tag{2.5}$$

Since all our functions are elementary ones, we need only to check the
condition iii) of the theorem:

$$\frac{D(F_1, F_2)}{D(x, z)}(1, 0, 1) = \begin{vmatrix} \frac{\partial F_1}{\partial x}(1, 0, 1) & \frac{\partial F_1}{\partial z}(1, 0, 1) \\ \frac{\partial F_2}{\partial x}(1, 0, 1) & \frac{\partial F_2}{\partial z}(1, 0, 1) \end{vmatrix} = \begin{vmatrix} 2 & -2 \\ 2 & 6 \end{vmatrix} = 16 \neq 0.$$

So, $x$ and $z$ can be seen as functions of $y$ in a neighborhood of $M$.
Let us compute the "velocity" and the "acceleration" at $M$, along the
curve $\gamma$. For this, it is not necessary to use the formula (2.4). Namely,
let us put in (2.5) $x(y)$ instead of $x$ and $z(y)$ instead of $z$:

$$\begin{cases} x(y)^2 + y^2 - z(y)^2 = 0, \\ x(y)^2 + 2y^2 + 3z(y)^2 - 4 = 0. \end{cases}$$

Let us differentiate both equations with respect to the ONLY free
variable $y$:

$$\begin{cases} 2x(y) x'(y) + 2y - 2z(y) z'(y) = 0, \\ 2x(y) x'(y) + 4y + 6z(y) z'(y) = 0. \end{cases}$$

This is an algebraic linear system in the variables $x'(y)$ and $z'(y)$.
Solving it, we get

$$x'(y) = -\frac{5}{4} \frac{y}{x(y)}, \quad z'(y) = -\frac{1}{4} \frac{y}{z(y)}. \tag{2.6}$$

To find $x''(y)$ and $z''(y)$ we differentiate again the formulas (2.6) and
get:

$$x''(y) = -\frac{5}{4} \frac{x(y) - y\, x'(y)}{x(y)^2}, \quad z''(y) = -\frac{1}{4} \frac{z(y) - y\, z'(y)}{z(y)^2}. \tag{2.7}$$

Now, it is easy to find $x'(0) = 0$, $z'(0) = 0$, $x''(0) = -\frac{5}{4}$ and $z''(0) = -\frac{1}{4}$.
Here is an example in which the velocity is zero at a point $M$ but the
acceleration is not zero at the same point. Thus, one has a nonzero
force at a stationary point!
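In this particular example the system $x^2 + y^2 - z^2 = 0$, $x^2 + 2y^2 + 3z^2 - 4 = 0$ can also be solved by hand near $M(1, 0, 1)$, which gives an independent check of the derivatives obtained by implicit differentiation. The explicit formulas used below, $x(y) = \sqrt{1 - 5y^2/4}$ and $z(y) = \sqrt{1 - y^2/4}$ (positive roots, because $x = z = 1$ at $M$), are derived for this sketch by eliminating $x^2$ between the two equations; they are not stated in the text.

```python
import math


def x_of(y):
    # From the two equations: 4 z^2 = 4 - y^2 and x^2 = z^2 - y^2.
    return math.sqrt(1 - 5 * y * y / 4)


def z_of(y):
    return math.sqrt(1 - y * y / 4)


def first_derivative(g, t, h=1e-5):
    """Central difference approximation of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)


def second_derivative(g, t, h=1e-4):
    """Second order central difference approximation of g''(t)."""
    return (g(t + h) - 2 * g(t) + g(t - h)) / (h * h)


# Velocity and acceleration along the curve at y = 0, i.e. at M(1, 0, 1).
print(first_derivative(x_of, 0.0), first_derivative(z_of, 0.0))
print(second_derivative(x_of, 0.0), second_derivative(z_of, 0.0))
```

The second line of output should be close to $-5/4$ and $-1/4$, in agreement with the values read off from the formulas for $x''(y)$ and $z''(y)$.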


3. Functional dependence 


Let $A$ be an open subset of $\mathbb{R}^n$ and let $f_1, f_2, \dots, f_m$ be $m$ functions
defined on $A$ with real values. We assume that each $f_i$ is of class $C^1$
on $A$.

DEFINITION 35. We say that $\{f_1, f_2, \dots, f_m\}$ are functional dependent
on $A$ if one of them, say $f_m$, is "a function" of the others
$f_1, f_2, \dots, f_{m-1}$,
i.e. there is a function $\phi(y_1, y_2, \dots, y_{m-1})$ of $m - 1$ variables, of class
$C^1$ on $\mathbb{R}^{m-1}$, such that

$$f_m(\mathbf{x}) = \phi(f_1(\mathbf{x}), f_2(\mathbf{x}), \dots, f_{m-1}(\mathbf{x}))$$

for any $\mathbf{x}$ in $A$.
For instance,
$$f_1(x_1, x_2, x_3) = x_1 + x_2 + x_3, \quad f_2(x_1, x_2, x_3) = x_1 x_2 + x_1 x_3 + x_2 x_3, \quad f_3(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2 \tag{3.1}$$
are functional dependent because $f_3 = f_1^2 - 2f_2$. Thus, $\phi(y_1, y_2) =
y_1^2 - 2y_2$.
We know from Linear Algebra that $f_1, f_2, \dots, f_m$ are linear dependent
if there are scalars $\lambda_1, \lambda_2, \dots, \lambda_m$, not all zero, such that

$$\lambda_1 f_1 + \lambda_2 f_2 + \dots + \lambda_m f_m = 0, \tag{3.2}$$

i.e. $\lambda_1 f_1(\mathbf{x}) + \lambda_2 f_2(\mathbf{x}) + \dots + \lambda_m f_m(\mathbf{x}) = 0$ for any $\mathbf{x}$ in $A$. Assume that
$\lambda_m \neq 0$, divide the equality (3.2) by $\lambda_m$ and compute $f_m$:

$$f_m = -\frac{\lambda_1}{\lambda_m} f_1 - \frac{\lambda_2}{\lambda_m} f_2 - \dots - \frac{\lambda_{m-1}}{\lambda_m} f_{m-1}.$$

Hence, $f_1, f_2, \dots, f_m$ are also functional dependent. The converse is not
true. For instance, the functions $f_1, f_2, f_3$ from (3.1) are functional
dependent but they are not linear dependent (prove it!). This shows
that the notion of functional dependence from Analysis is more general
than the notion of linear dependence from Linear Algebra.
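Both claims about the functions $f_1 = x_1 + x_2 + x_3$, $f_2 = x_1x_2 + x_1x_3 + x_2x_3$, $f_3 = x_1^2 + x_2^2 + x_3^2$ can be spot-checked with integers. In the sketch below the relation $f_3 = f_1^2 - 2f_2$ is tested on a grid, and linear independence is shown by evaluating the three functions at three sample points (the points are an arbitrary choice of this sketch): if $\lambda_1 f_1 + \lambda_2 f_2 + \lambda_3 f_3 = 0$ held identically, the matrix of values would have to be singular.

```python
from itertools import product


def f1(x1, x2, x3):
    return x1 + x2 + x3


def f2(x1, x2, x3):
    return x1 * x2 + x1 * x3 + x2 * x3


def f3(x1, x2, x3):
    return x1**2 + x2**2 + x3**2


def det3(m):
    """Determinant of a 3 x 3 matrix by the cofactor expansion of the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Functional dependence: f3 = f1^2 - 2 f2 on an integer grid.
grid = list(product(range(-3, 4), repeat=3))
relation_holds = all(f3(*p) == f1(*p) ** 2 - 2 * f2(*p) for p in grid)

# Linear independence: values of (f1, f2, f3) at three sample points.
samples = [(1, 0, 0), (2, 0, 0), (1, 1, 0)]
values = [[f1(*p), f2(*p), f3(*p)] for p in samples]

print(relation_holds, det3(values))
```

A nonzero determinant means that only $\lambda_1 = \lambda_2 = \lambda_3 = 0$ satisfies the three linear conditions, so no nontrivial linear relation can hold on the whole space.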


THEOREM 84. Let $A$ be an open subset of $\mathbb{R}^n$ and let $f_1, f_2, \dots, f_m :
A \to \mathbb{R}$ be $m$ functions of class $C^1$ on $A$. If $\{f_1, f_2, \dots, f_m\}$ are
functional dependent on $A$, then the rank of the Jacobian matrix of $\mathbf{f} =
(f_1, f_2, \dots, f_m) : A \to \mathbb{R}^m$ is less than $m$.

PROOF. Suppose that $f_m(\mathbf{x}) = \phi[f_1(\mathbf{x}), f_2(\mathbf{x}), \dots, f_{m-1}(\mathbf{x})]$ for all $\mathbf{x}$
in $A$. Then,
$$\frac{\partial f_m}{\partial x_j} = \frac{\partial \phi}{\partial y_1} \frac{\partial f_1}{\partial x_j} + \frac{\partial \phi}{\partial y_2} \frac{\partial f_2}{\partial x_j} + \dots + \frac{\partial \phi}{\partial y_{m-1}} \frac{\partial f_{m-1}}{\partial x_j}$$
for all $j = 1, 2, \dots, n$. This means that the $m$-th row of the matrix $J_{\mathbf{x},\mathbf{f}}$ is
a linear combination of the first $m - 1$ rows, so the rank of the Jacobian
matrix $J_{\mathbf{x},\mathbf{f}}$ is less than $m$ (why? See any Linear Algebra course).

We say that $f_1, f_2, \dots, f_m$ are dependent at $\mathbf{a}$, a point in $A$, if there
is a neighborhood $U$ of $\mathbf{a}$, $U \subset A$, such that $f_1, f_2, \dots, f_m$ are dependent
on $U$. If $f_1, f_2, \dots, f_m$ are not dependent at $\mathbf{a}$, we say that they are
independent at $\mathbf{a}$. If $f_1, f_2, \dots, f_m$ are independent at any point of $A$, we
say that $f_1, f_2, \dots, f_m$ are independent on $A$.

THEOREM 85. If the rank of $J_{\mathbf{x},\mathbf{f}}$ is equal to $m$ for any $\mathbf{x}$ in $A$, then
$f_1, f_2, \dots, f_m$ are independent on $A$.

PROOF. Suppose the contrary, namely that there is a point $\mathbf{a}$ in $A$ and
a small neighborhood $U$ of $\mathbf{a}$, such that $f_1, f_2, \dots, f_m$ are dependent on
$U$. Applying Theorem 84 we get that the rank of $J_{\mathbf{a},\mathbf{f}}$ is less than $m$. A
contradiction! Thus, $f_1, f_2, \dots, f_m$ are independent on $A$.


We also have a converse of the last two theorems.


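THEOREM 86. With the above notation and hypotheses, if $m \leq n$, if
$\mathbf{f} = (f_1, f_2, \dots, f_m)$ is of class $C^1$ on $A$ and if for a fixed point $\mathbf{a}$ of $A$ one
has that the rank of $J_{\mathbf{a},\mathbf{f}}$ is less than $m$, then there is a neighborhood $U$ of
$\mathbf{a}$, $U \subset A$, and $s$ functions ($s < m$) from $\{f_1, f_2, \dots, f_m\}$, say $f_1, f_2, \dots, f_s$, which
are independent on $U$, such that the other functions $\{f_{s+1}, f_{s+2}, \dots, f_m\}$
are functional dependent on $f_1, f_2, \dots, f_s$ on $U$. This means that there
are $m - s$ functions $\phi_1, \phi_2, \dots, \phi_{m-s}$ of class $C^1$ on $\mathbb{R}^s$ such that

$$f_{s+1}(\mathbf{x}) = \phi_1(f_1(\mathbf{x}), \dots, f_s(\mathbf{x})),\ \dots,\ f_m(\mathbf{x}) = \phi_{m-s}(f_1(\mathbf{x}), \dots, f_s(\mathbf{x}))$$

for all $\mathbf{x}$ in $U$.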


The proof involves some more sophisticated tools and we refer the
interested reader to [Pal] or [FS]. Let us apply this last theorem in a
more complicated example. Let

$$\begin{aligned} f_1 &= x_1 x_3 + x_2 x_4 \\ f_2 &= x_1 x_4 - x_2 x_3 \\ f_3 &= x_1^2 + x_2^2 - x_3^2 - x_4^2 \\ f_4 &= x_1^2 + x_2^2 + x_3^2 + x_4^2 \end{aligned}$$

be four functions of variables $x_1, x_2, x_3, x_4$. The Jacobian matrix of
$\mathbf{f} = (f_1, f_2, f_3, f_4)$ at $\mathbf{a} = (1, 1, 0, 0)$ is

$$J_{\mathbf{a},\mathbf{f}} = \begin{pmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & -1 & 1 \\ 2 & 2 & 0 & 0 \\ 2 & 2 & 0 & 0 \end{pmatrix}.$$

Since the rank of this matrix is 3 and a nonzero $3 \times 3$ determinant
involves the first 3 rows, one sees that $f_1, f_2, f_3$ are functional
independent at $\mathbf{a}$ and $f_4$ is a function of the others in a neighborhood of $\mathbf{a}$.
If we look carefully, we see that $f_4^2 = 4(f_1^2 + f_2^2) + f_3^2$, so $f_1, f_2, f_3, f_4$
are functional dependent on the whole $\mathbb{R}^4$.
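The identity $f_4^2 = 4(f_1^2 + f_2^2) + f_3^2$ and the Jacobian at $\mathbf{a} = (1, 1, 0, 0)$ can be verified mechanically. The sketch below reads the four functions as $f_1 = x_1x_3 + x_2x_4$, $f_2 = x_1x_4 - x_2x_3$, $f_3 = x_1^2 + x_2^2 - x_3^2 - x_4^2$, $f_4 = x_1^2 + x_2^2 + x_3^2 + x_4^2$; the rows of the Jacobian are entered from partial derivatives computed by hand, and the choice of columns for the $3 \times 3$ minor is just one possibility.

```python
from itertools import product


def fs(x1, x2, x3, x4):
    """Values (f1, f2, f3, f4) at the point (x1, x2, x3, x4)."""
    return (
        x1 * x3 + x2 * x4,
        x1 * x4 - x2 * x3,
        x1**2 + x2**2 - x3**2 - x4**2,
        x1**2 + x2**2 + x3**2 + x4**2,
    )


def jacobian(x1, x2, x3, x4):
    """Rows are the gradients of f1, f2, f3, f4 (computed by hand)."""
    return [
        [x3, x4, x1, x2],
        [x4, -x3, -x2, x1],
        [2 * x1, 2 * x2, -2 * x3, -2 * x4],
        [2 * x1, 2 * x2, 2 * x3, 2 * x4],
    ]


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# The identity f4^2 = 4 (f1^2 + f2^2) + f3^2 on an integer grid.
identity_holds = all(
    v[3] ** 2 == 4 * (v[0] ** 2 + v[1] ** 2) + v[2] ** 2
    for v in (fs(*p) for p in product(range(-2, 3), repeat=4))
)

J = jacobian(1, 1, 0, 0)
# 3 x 3 minor built from the first three rows and the columns 1, 3, 4.
minor = det3([[row[0], row[2], row[3]] for row in J[:3]])

print(identity_holds, J, minor)
```

Rows 3 and 4 of the Jacobian coincide at $\mathbf{a}$, so the rank is at most 3, and the nonzero minor shows that it is exactly 3.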


4. Conditional extremum points


Sometimes we have to find the extremum points of a function $f$
defined on a compact subset $C$ of $\mathbb{R}^n$. For instance, let $C$ be the closed
ball

$$B[\mathbf{0}, 3] = \{(x, y, z) : x^2 + y^2 + z^2 \leq 9\},$$
centered at $\mathbf{0} = (0, 0, 0)$ and of radius 3. The problem of finding the
extremum points of the function $f(x, y, z) = x + 2y + 3z$ defined on $C$
can be divided into two parts. First of all we find the local extremum
points of $f$ defined only on the open set
$$B(\mathbf{0}, 3) = \{(x, y, z) : x^2 + y^2 + z^2 < 9\}$$
by using Fermat's theorem, then we consider only the points on the
sphere $x^2 + y^2 + z^2 = 9$ and try to find the extremum points $M(x, y, z)$
of $f$ which verify this last supplementary condition (a constraint). This
last problem is an example of a conditional extremum points problem.
The general method for solving such problems is the "method of
Lagrange's multipliers". In the following we shall describe this method.
Let $A$ be an open subset of $\mathbb{R}^n$ and let $f, g_1, g_2, \dots, g_m$ ($m < n$) be
functions of class $C^1$ on $A$. We assume that $g_1, g_2, \dots, g_m$ are functional
independent on $A$; in particular, if $\mathbf{g} = (g_1, g_2, \dots, g_m)$, its Jacobian
matrix $J_{\mathbf{x},\mathbf{g}}$ has the rank $m$ at any point $\mathbf{x}$ of $A$. Let $S \subset A$ be the set
of all solutions (in $A$) of the following system of equations:

$$\begin{cases} g_1(x_1, x_2, \dots, x_n) = 0 \\ \quad \vdots \\ g_m(x_1, x_2, \dots, x_n) = 0 \end{cases} \tag{4.1}$$

These equations are called constraints or supplementary conditions for
the variables $x_1, x_2, \dots, x_n$.


DEFINITION 36. We say that a point $\mathbf{a} = (a_1, a_2, \dots, a_n)$ of $S$ is
a local conditional maximum point for $f$ with the constraints (4.1) if
there is a neighborhood $U$ of $\mathbf{a}$, $U \subset A$, such that $f(\mathbf{x}) \leq f(\mathbf{a})$ for any
$\mathbf{x}$ in $U \cap S$. The notion of a local conditional minimum point with the
same constraints, for the same function $f$, can be defined in the same
manner.

For instance, $(0, 0)$ is a local conditional minimum for $f(x, y) =
x^2 + y$ defined on $\mathbb{R}^2$ with the constraint $y - x^2 = 0$. Indeed, $f(x, x^2) =
2x^2 \geq 0 = f(0, 0)$ for any $x \in \mathbb{R}$. But $(0, 0)$ is not a local extremum
point for $f$.

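Let $\boldsymbol{\lambda} = (\lambda_1, \lambda_2, \dots, \lambda_m)$ be a variable vector in $\mathbb{R}^m$. These new
auxiliary variables $\lambda_1, \lambda_2, \dots, \lambda_m$ are called Lagrange's multipliers and
the new auxiliary function

$$\Phi(x_1, x_2, \dots, x_n; \lambda_1, \lambda_2, \dots, \lambda_m) = \Phi(\mathbf{x}, \boldsymbol{\lambda}) = f(\mathbf{x}) + \sum_{j=1}^{m} \lambda_j g_j(\mathbf{x}) \tag{4.2}$$

is called Lagrange's associated function.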


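THEOREM 87. (Lagrange's Theorem) Let us preserve all the above
notation and hypotheses. Assume that $\mathbf{a}$ is a local conditional extremum
point for $f$, with the constraints (4.1). Then there is a vector $\boldsymbol{\lambda}^* =
(\lambda_1^*, \lambda_2^*, \dots, \lambda_m^*)$ in $\mathbb{R}^m$ such that the point

$$(\mathbf{a}, \boldsymbol{\lambda}^*) = (a_1, a_2, \dots, a_n; \lambda_1^*, \lambda_2^*, \dots, \lambda_m^*)$$

is a critical (stationary) point for Lagrange's function $\Phi$, i.e.
$$\operatorname{grad} \Phi(\mathbf{a}, \boldsymbol{\lambda}^*) = \mathbf{0}.$$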


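PROOF. (for $n = 2$ and $m = 1$) Suppose that $\mathbf{a} = (a_1, a_2)$ is a local conditional
maximum point for $f$. Since $g = g_1$ is functional independent, it cannot
be a constant function, say $\frac{\partial g}{\partial x_2}(\mathbf{a}) \neq 0$. We can apply the Implicit
Function Theorem and find a function $h : U_1 \to U_2$ of class $C^1$ on $U_1$,
an appropriate neighborhood of $a_1$ ($U_2$ is a neighborhood of $a_2$), such
that $h(a_1) = a_2$, $g(x_1, h(x_1)) = 0$ for all $x_1$ in $U_1$ and

$$h'(x_1) = -\frac{\frac{\partial g}{\partial x_1}(x_1, h(x_1))}{\frac{\partial g}{\partial x_2}(x_1, h(x_1))} \tag{4.3}$$

for all $x_1$ in $U_1$. We can assume that the neighborhood $U = U_1 \times U_2$ of $\mathbf{a}$
is sufficiently small such that $f(\mathbf{x}) \leq f(\mathbf{a})$ for any $\mathbf{x}$ in $U \cap S$. We define
now a new function $D : U_1 \to \mathbb{R}$, $D(x_1) = f(x_1, h(x_1))$ for any $x_1$ in
$U_1$. Since $D(x_1) \leq D(a_1)$ for all $x_1$ in $U_1$, we see that $a_1$ is a local
maximum point for the function $D$. Use now Fermat's Theorem and
find that $D'(a_1) = 0$, or that

$$\frac{\partial f}{\partial x_1}(\mathbf{a}) + \frac{\partial f}{\partial x_2}(\mathbf{a}) \cdot h'(a_1) = 0.$$

Thus (we write the next formulas as quotients, assuming that the partial
derivatives which appear as denominators are not zero),

$$h'(a_1) = -\frac{\frac{\partial f}{\partial x_1}(\mathbf{a})}{\frac{\partial f}{\partial x_2}(\mathbf{a})}. \tag{4.4}$$

But the same $h'(a_1)$ can also be computed from the formula (4.3):

$$h'(a_1) = -\frac{\frac{\partial g}{\partial x_1}(a_1, a_2)}{\frac{\partial g}{\partial x_2}(a_1, a_2)}.$$

If we equate the two expressions of $h'(a_1)$ we get

$$\frac{\frac{\partial f}{\partial x_1}(\mathbf{a})}{\frac{\partial g}{\partial x_1}(\mathbf{a})} = \frac{\frac{\partial f}{\partial x_2}(\mathbf{a})}{\frac{\partial g}{\partial x_2}(\mathbf{a})}.$$

Let us put

$$\lambda^* \overset{\text{def}}{=} -\frac{\frac{\partial f}{\partial x_1}(\mathbf{a})}{\frac{\partial g}{\partial x_1}(\mathbf{a})} = -\frac{\frac{\partial f}{\partial x_2}(\mathbf{a})}{\frac{\partial g}{\partial x_2}(\mathbf{a})} \tag{4.5}$$

and let us write Lagrange's auxiliary function for this "multiplier"
$\lambda^*$:

$$\Phi(\mathbf{x}, \lambda^*) = f(\mathbf{x}) + \lambda^* g(\mathbf{x}).$$

Let us compute $\operatorname{grad} \Phi(\mathbf{a}, \lambda^*)$ by taking into account the value of $\lambda^*$ from
(4.5):

$$\begin{aligned} \frac{\partial \Phi}{\partial x_1}(\mathbf{a}, \lambda^*) &= \frac{\partial f}{\partial x_1}(\mathbf{a}) + \lambda^* \frac{\partial g}{\partial x_1}(\mathbf{a}) = 0, \\ \frac{\partial \Phi}{\partial x_2}(\mathbf{a}, \lambda^*) &= \frac{\partial f}{\partial x_2}(\mathbf{a}) + \lambda^* \frac{\partial g}{\partial x_2}(\mathbf{a}) = 0, \\ \frac{\partial \Phi}{\partial \lambda}(\mathbf{a}, \lambda^*) &= g(\mathbf{a}) = 0, \text{ because } \mathbf{a} \in S. \end{aligned}$$

Hence $\operatorname{grad} \Phi(\mathbf{a}, \lambda^*) = \mathbf{0}$ and the proof is complete.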


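Look now at the function

\[
\Phi(\mathbf{x}, \boldsymbol{\lambda}^*) = f(\mathbf{x}) + \sum_{j=1}^{m} \lambda_j^* g_j(\mathbf{x}),
\]

where $\boldsymbol{\lambda}^* = (\lambda_1^*, \lambda_2^*, \dots, \lambda_m^*)$ is the vector just constructed in Theorem
87. Since every $g_j$ vanishes on $S$, the functions $f$ and $T(\mathbf{x}) = \Phi(\mathbf{x}, \boldsymbol{\lambda}^*)$
coincide on $S$, so $\mathbf{a}$ is a local conditional maximum (for instance!)
for $f$ if and only if $\mathbf{a}$ is a local conditional maximum for $T$ with the
same constraints. Thus, if we want to decide if a stationary point $(\mathbf{a}, \boldsymbol{\lambda}^*)$ of the
Lagrange function is a conditional extremum point, we must consider
the second differential of $T$ at $\mathbf{a}$. But, in the expression of $d^2T(\mathbf{a})$ we
must take into account the connections between $dx_1, dx_2, \dots, dx_n$. These
connections can be found by differentiating the equations (4.1):

\[
\begin{cases}
\dfrac{\partial g_1}{\partial x_1}(\mathbf{a})\,dx_1 + \dots + \dfrac{\partial g_1}{\partial x_n}(\mathbf{a})\,dx_n = 0 \\
\qquad \vdots \\
\dfrac{\partial g_m}{\partial x_1}(\mathbf{a})\,dx_1 + \dots + \dfrac{\partial g_m}{\partial x_n}(\mathbf{a})\,dx_n = 0
\end{cases}
\]

Since the rank of the Jacobi matrix $J_{\mathbf{a},\mathbf{g}}$ is $m < n$, this linear system
in the unknown quantities $dx_1, dx_2, \dots, dx_n$ has an infinite number of
solutions. Namely, say that the last $n - m$ unknowns $dx_{m+1}, \dots, dx_n$
remain free and the others $dx_1, dx_2, \dots, dx_m$ can be linearly expressed
as functions of the last $n - m$. Thus, the differential $d^2\Phi(\mathbf{a}, \boldsymbol{\lambda}^*)$ becomes
a quadratic form in $n - m$ free variables. The sign of this quadratic form must
be considered in any discussion about the nature of the point $\mathbf{a}$.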

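Let us find the points of the compact set $x^2 + y^2 \le 1$ at which the
function $f(x,y) = (x-1)^2 + (y-2)^2$ attains its maximum and minimum
values. Let us first look for the local extrema inside the disc $x^2 + y^2 < 1$:

\[
\frac{\partial f}{\partial x} = 2(x - 1) = 0, \qquad \frac{\partial f}{\partial y} = 2(y - 2) = 0.
\]

So the only critical point is $M(1,2)$. But this point is outside the disc, thus
$f$ has no local extremum point inside the disc, and its extreme values on the
compact set are attained on the boundary circle.
Let us consider now the conditional problem:

\[
\max(\min)\ f
\]

with the restriction

\[
g(x,y) = x^2 + y^2 - 1 = 0.
\]

The auxiliary Lagrange function is

\[
\Phi(x, y, \lambda) = f(x,y) + \lambda(x^2 + y^2 - 1).
\]

Let us find its critical points:

\[
\begin{cases}
\dfrac{\partial \Phi}{\partial x} = 2(x - 1) + 2\lambda x = 0 \\[2mm]
\dfrac{\partial \Phi}{\partial y} = 2(y - 2) + 2\lambda y = 0 \\[2mm]
\dfrac{\partial \Phi}{\partial \lambda} = x^2 + y^2 - 1 = 0
\end{cases}
\]

Solve this system and find $x = \frac{1}{1+\lambda}$ and $y = \frac{2}{1+\lambda}$ (why can $\lambda$ not be
$-1$?), $\lambda_1 = \sqrt{5} - 1$, $x_1 = \frac{1}{\sqrt{5}}$, $y_1 = \frac{2}{\sqrt{5}}$ and $\lambda_2 = -\sqrt{5} - 1$, $x_2 = -\frac{1}{\sqrt{5}}$,
$y_2 = -\frac{2}{\sqrt{5}}$. Let us denote $M_1\left(\frac{1}{\sqrt{5}}, \frac{2}{\sqrt{5}}\right)$ and $M_2\left(-\frac{1}{\sqrt{5}}, -\frac{2}{\sqrt{5}}\right)$. In order to
see the nature of these critical points, let us find the expression of the
second differential of $\Phi(x, y, \lambda)$ for a constant parameter $\lambda$. We find

\[
d^2\Phi(x, y, \lambda) = (2 + 2\lambda)\,dx^2 + (2 + 2\lambda)\,dy^2.
\]

Since $x\,dx + y\,dy = 0$, then $dy = -\frac{x}{y}\,dx$, so

\[
d^2\Phi(x, y, \lambda) = (2 + 2\lambda)\left(1 + \frac{x^2}{y^2}\right)dx^2.
\]

For $\lambda_1 = \sqrt{5} - 1$, we get that $M_1$ is a local conditional minimum. For
$\lambda_2 = -\sqrt{5} - 1$, we obtain that $M_2$ is a local conditional maximum.
Hence, the global maximum of $f$ on the compact subset $\{(x, y) : x^2 +
y^2 \le 1\}$ is $f\left(-\frac{1}{\sqrt{5}}, -\frac{2}{\sqrt{5}}\right) = 6 + 2\sqrt{5}$. Its global minimum is $f\left(\frac{1}{\sqrt{5}}, \frac{2}{\sqrt{5}}\right) = 6 - 2\sqrt{5}$.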
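The computation above can be cross-checked numerically. The following is a minimal sketch, not part of the original text: it evaluates $f$ at the two stationary points obtained from $1 + \lambda = \pm\sqrt{5}$ and compares them with a brute-force scan of the unit circle. The function names `lagrange_candidates` and `scan_circle` are hypothetical helpers introduced only for this illustration.

```python
import math

def f(x, y):
    """Objective from the example: squared distance to the point (1, 2)."""
    return (x - 1) ** 2 + (y - 2) ** 2

def lagrange_candidates():
    """Stationary points of Phi = f + lam*(x^2 + y^2 - 1), from 1 + lam = +-sqrt(5)."""
    out = []
    for sign in (1.0, -1.0):
        lam = sign * math.sqrt(5) - 1
        x, y = 1 / (1 + lam), 2 / (1 + lam)
        out.append((lam, x, y, f(x, y)))
    return out

def scan_circle(steps=100_000):
    """Brute-force min and max of f on the unit circle (illustrative check only)."""
    values = [
        f(math.cos(2 * math.pi * k / steps), math.sin(2 * math.pi * k / steps))
        for k in range(steps)
    ]
    return min(values), max(values)
```

Calling `lagrange_candidates()` and `scan_circle()` should give matching extreme values, $6 - 2\sqrt{5}$ and $6 + 2\sqrt{5}$, up to the resolution of the scan.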

Let us consider now a practical problem of conditional extremum.
Let us find the distance between the line $x - y = 5$ and the parabola
$y = x^2$. Let $L(x_1, y_1)$ be a running point on the line and let $P(x_2, y_2)$
be a running point on the parabola. The square $f(x_1, x_2, y_1, y_2) =
(x_1 - x_2)^2 + (y_1 - y_2)^2$ of the distance between two such points must
be minimum and the constraints are

\[
g_1(x_1, x_2, y_1, y_2) = x_1 - y_1 - 5 = 0
\]

and

\[
g_2(x_1, x_2, y_1, y_2) = x_2^2 - y_2 = 0.
\]

The Lagrange function is

\[
\Phi(x_1, x_2, y_1, y_2; \lambda_1, \lambda_2) = (x_1 - x_2)^2 + (y_1 - y_2)^2 + \lambda_1(x_1 - y_1 - 5) + \lambda_2(x_2^2 - y_2).
\]

If we solve the algebraic system $\operatorname{grad}\Phi = \mathbf{0}$ (six equations in the six
unknowns $x_1, x_2, y_1, y_2, \lambda_1, \lambda_2$), we get $x_1 = \frac{23}{8}$, $y_1 = -\frac{17}{8}$, $x_2 = \frac{1}{2}$,
$y_2 = \frac{1}{4}$ and the corresponding distance is $\frac{19}{4\sqrt{2}} = \frac{19\sqrt{2}}{8}$.
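As a sanity check of this example, one can verify that the gradient of the Lagrange function vanishes at the point found. The sketch below is an illustration under stated assumptions: the stationary point $x_1 = 23/8$, $y_1 = -17/8$, $x_2 = 1/2$, $y_2 = 1/4$ and the multipliers $\lambda_1 = -19/4$, $\lambda_2 = 19/4$ were worked out by hand from the system $\operatorname{grad}\Phi = \mathbf{0}$, and `brute_force_distance` is a hypothetical helper that scans the parabola directly.

```python
import math

def grad_phi(x1, x2, y1, y2, lam1, lam2):
    """Gradient of Phi = (x1-x2)^2 + (y1-y2)^2 + lam1*(x1-y1-5) + lam2*(x2^2-y2)."""
    return (
        2 * (x1 - x2) + lam1,             # d/dx1
        -2 * (x1 - x2) + 2 * lam2 * x2,   # d/dx2
        2 * (y1 - y2) - lam1,             # d/dy1
        -2 * (y1 - y2) - lam2,            # d/dy2
        x1 - y1 - 5,                      # d/dlam1 (line constraint)
        x2 ** 2 - y2,                     # d/dlam2 (parabola constraint)
    )

# Candidate stationary point and multipliers (computed by hand, see the lead-in).
X1, Y1 = 23 / 8, -17 / 8
X2, Y2 = 1 / 2, 1 / 4
LAM1, LAM2 = -19 / 4, 19 / 4
DISTANCE = math.hypot(X1 - X2, Y1 - Y2)

def brute_force_distance(steps=200_001, lo=-5.0, hi=5.0):
    """Minimum over the parabola of the point-to-line distance |x - x^2 - 5| / sqrt(2)."""
    best = float("inf")
    for k in range(steps):
        x = lo + (hi - lo) * k / (steps - 1)
        best = min(best, abs(x - x * x - 5) / math.sqrt(2))
    return best
```

All six components of `grad_phi(X1, X2, Y1, Y2, LAM1, LAM2)` vanish, and `DISTANCE` agrees with the brute-force scan.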


5. Change of variables 


What is the plane curve $xy = 2$? We know that an equation of the
form $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$ is a hyperbola. If we introduce two new variables $X$
and $Y$ such that $x = \frac{1}{\sqrt{2}}X - \frac{1}{\sqrt{2}}Y$ and $y = \frac{1}{\sqrt{2}}X + \frac{1}{\sqrt{2}}Y$, we introduce in
fact a new cartesian coordinate system $XOY$ which is obtained from
$xOy$ by a rotation of $45^\circ$ in the direct sense (see Fig. 10.2).

Fig. 10.2

Our initial curve $xy = 2$ becomes $X^2 - Y^2 = 4$, i.e. we have a
usual hyperbola with $a = b = 2$ relative to the new cartesian coordinate
system $XOY$.
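The effect of this rotation can be checked on a few sample points. The sketch below is only an illustration: it parametrizes one branch of $X^2 - Y^2 = 4$ by $X = 2\cosh s$, $Y = 2\sinh s$ (a parametrization chosen here, not taken from the text), maps the points back to the old coordinates and confirms that $xy = 2$.

```python
import math

def to_old(X, Y):
    """Rotation by 45 degrees: new coordinates (X, Y) -> old coordinates (x, y)."""
    c = 1 / math.sqrt(2)
    return c * (X - Y), c * (X + Y)

def hyperbola_point(s):
    """A point of X^2 - Y^2 = 4, parametrized by X = 2*cosh(s), Y = 2*sinh(s)."""
    return 2 * math.cosh(s), 2 * math.sinh(s)

products = []
for s in (-1.5, -0.5, 0.0, 0.7, 2.0):
    x, y = to_old(*hyperbola_point(s))
    products.append(x * y)   # each product should be (numerically) 2
```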

The moral is that sometimes it is better to change the old cartesian
coordinate system, i.e. to change the old variables $x_1, x_2, \dots, x_n$ with
new ones $y_1, y_2, \dots, y_n$ which are functions of the first ones:

\[
\begin{cases}
y_1 = y_1(x_1, x_2, \dots, x_n) \\
\qquad \vdots \\
y_n = y_n(x_1, x_2, \dots, x_n)
\end{cases}
\tag{5.1}
\]

Here we forced the notation. The function of $n$ variables which defines
the new variable $y_1$ is also denoted by $y_1$, etc.


DEFINITION 37. Let $D, \Omega$ be two open subsets of $\mathbb{R}^n$ and let $\mathbf{f} : D \to
\Omega$ be a diffeomorphism of class $C^k$ on $D$, i.e. $\mathbf{f}$ is a bijection, it is of
class $C^k$ on $D$ and its inverse $\mathbf{f}^{-1}$ is also of class $C^k$ on $\Omega$. Usually,
$k = 1$ or $2$. We call such an $\mathbf{f}$ a change of variables of class $C^k$.


If we write

\[
\mathbf{f}(x_1, x_2, \dots, x_n) = (y_1(x_1, x_2, \dots, x_n), \dots, y_n(x_1, x_2, \dots, x_n)),
\]

we have a representation like (5.1) for the vector function $\mathbf{f}$. We also call
such a representation a change of variables. We represent the inverse
of $\mathbf{f}$ by:

\[
\begin{cases}
x_1 = x_1(y_1, y_2, \dots, y_n) \\
\qquad \vdots \\
x_n = x_n(y_1, y_2, \dots, y_n)
\end{cases}
\tag{5.2}
\]

In fact, we solved the system (5.1) and we computed $x_1, x_2, \dots, x_n$ as
functions of $y_1, y_2, \dots, y_n$. For instance, if $y_1 = x_1 + x_2$ and $y_2 = 2x_1 - x_2$,
then $x_1 = \frac{1}{3}(y_1 + y_2)$ and $x_2 = \frac{1}{3}(2y_1 - y_2)$.

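If one considers an expression which involves a function $g(x_1, x_2, \dots, x_n)$ and its partial derivatives

\[
\frac{\partial g}{\partial x_i}, \quad \frac{\partial^2 g}{\partial x_i \partial x_j}, \quad \dots
\]

the problem is to find an appropriate change of variables of the form
(5.2) such that the new expression in the new variables $y_1, y_2, \dots, y_n$ has
a simpler form. Thus, the "old" function $g(x_1, x_2, \dots, x_n)$ becomes a
"new" function $\bar{g}(y_1, y_2, \dots, y_n)$. The relations between these two functions are

\[
\bar{g}(y_1, y_2, \dots, y_n) = g(x_1(y_1, y_2, \dots, y_n), \dots, x_n(y_1, y_2, \dots, y_n))
\tag{5.3}
\]

and

\[
g(x_1, x_2, \dots, x_n) = \bar{g}(y_1(x_1, x_2, \dots, x_n), \dots, y_n(x_1, x_2, \dots, x_n)).
\tag{5.4}
\]

Now, the problem is to express the partial derivatives

\[
\frac{\partial g}{\partial x_i}, \quad \frac{\partial^2 g}{\partial x_i \partial x_j}, \quad \dots
\]

only in terms of the partial derivatives of the new function
$\bar{g}(y_1, y_2, \dots, y_n)$. This is an easy job if we know how to manipulate the
chain rules. For instance, if $\mathbf{x} = (x_1, x_2, \dots, x_n)$ and $\mathbf{y} = (y_1, y_2, \dots, y_n)$,
from (5.4) one has:

\[
\frac{\partial g}{\partial x_i}(\mathbf{x}) = \frac{\partial \bar{g}}{\partial y_1}(\mathbf{y}) \frac{\partial y_1}{\partial x_i}(\mathbf{x}) + \dots + \frac{\partial \bar{g}}{\partial y_n}(\mathbf{y}) \frac{\partial y_n}{\partial x_i}(\mathbf{x}),
\]

$i = 1, 2, \dots, n$. To have "everything" in $y_1, y_2, \dots, y_n$, we finally put
$x_1(y_1, y_2, \dots, y_n)$ instead of $x_1$, ..., $x_n(y_1, y_2, \dots, y_n)$ instead of $x_n$.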
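The chain rule above can be tested numerically on the linear change of variables $y_1 = x_1 + x_2$, $y_2 = 2x_1 - x_2$ used earlier in this section. In the sketch below the "old" function $g(x_1, x_2) = x_1^2 x_2$ is a hypothetical choice made only for illustration, and `partial` is a simple central-difference helper, not a library routine.

```python
def g(x1, x2):
    """Hypothetical 'old' function, chosen only for illustration."""
    return x1 ** 2 * x2

def x_of_y(y1, y2):
    """Inverse change of variables: x1 = (y1 + y2)/3, x2 = (2*y1 - y2)/3."""
    return (y1 + y2) / 3, (2 * y1 - y2) / 3

def y_of_x(x1, x2):
    """Change of variables y1 = x1 + x2, y2 = 2*x1 - x2."""
    return x1 + x2, 2 * x1 - x2

def g_bar(y1, y2):
    """The 'new' function: g expressed in the variables y1, y2."""
    return g(*x_of_y(y1, y2))

def partial(fun, point, index, h=1e-5):
    """Central-difference approximation of a partial derivative."""
    p_plus, p_minus = list(point), list(point)
    p_plus[index] += h
    p_minus[index] -= h
    return (fun(*p_plus) - fun(*p_minus)) / (2 * h)

def chain_rule_gradient(x1, x2):
    """dg/dx_i = sum_k (d g_bar/dy_k) * (dy_k/dx_i); here dy/dx = [[1, 1], [2, -1]]."""
    y = y_of_x(x1, x2)
    gy1, gy2 = partial(g_bar, y, 0), partial(g_bar, y, 1)
    return gy1 * 1 + gy2 * 2, gy1 * 1 + gy2 * (-1)
```

For this $g$ the exact partial derivatives are $2x_1x_2$ and $x_1^2$, so `chain_rule_gradient(x1, x2)` can be compared against them directly.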
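For instance, let us make the substitution (change of variables)
$x = \exp(t)$ in the following Euler's equation:

\[
x^2 \frac{d^2y}{dx^2} + x \frac{dy}{dx} = 0, \quad x > 0.
\]

First of all recall the differential notation: $y = y(x)$, $y'(x) = \frac{dy}{dx}$ (since
$dy = y'(x)\,dx$) and $y''(x) = \frac{d^2y}{dx^2}$ (since $d^2y = y''(x)\,dx^2$; see the formula
for the second differential!). Let us denote $\bar{y}(t) = y(\exp(t))$. Since
$y(x) = \bar{y}(\ln x)$, one has that

\[
\frac{dy}{dx} = \frac{d\bar{y}}{dt} \cdot \frac{dt}{dx} = \frac{d\bar{y}}{dt} \cdot \frac{1}{x} = \frac{d\bar{y}}{dt} \cdot \exp(-t), \quad \text{i.e.} \quad \frac{d}{dx} = \exp(-t)\,\frac{d}{dt}.
\]

Let us compute

\[
\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d}{dx}\left(\frac{d\bar{y}}{dt} \cdot \exp(-t)\right) = \frac{d}{dt}\left(\frac{d\bar{y}}{dt} \cdot \exp(-t)\right) \cdot \exp(-t).
\]

Applying the rule for the derivative of a product, we get:

\[
\frac{d^2y}{dx^2} = \left(\frac{d^2\bar{y}}{dt^2} - \frac{d\bar{y}}{dt}\right) \cdot \exp(-2t).
\]

Substituting in the initial equation, we get $\frac{d^2\bar{y}}{dt^2} = 0$, i.e. $\bar{y} = C_1 t + C_2$,
where $C_1, C_2$ are arbitrary constants. Thus, $y(x) = C_1 \ln x + C_2$ and
we just found the general solution of the initial differential equation.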
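The solution can be checked by substituting it back into the equation $x^2 y'' + x y' = 0$. The sketch below does this numerically with central differences; it is only an illustration, the constants passed to `general_solution` are arbitrary, and the step size `h` is an assumption chosen to balance truncation and rounding errors.

```python
import math

def euler_residual(y, x, h=1e-4):
    """Finite-difference value of x^2 * y'' + x * y' at the point x > 0."""
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / h ** 2
    return x ** 2 * d2 + x * d1

def general_solution(c1, c2):
    """y(x) = c1*ln(x) + c2, the solution found via the substitution x = exp(t)."""
    return lambda x: c1 * math.log(x) + c2

y = general_solution(2.0, -3.0)   # arbitrary constants, for illustration
residuals = [euler_residual(y, x) for x in (0.5, 1.0, 2.0, 3.0)]
```

The residuals are zero up to discretization error, whereas a function that is not a solution, such as $y = x^2$, gives the residual $4x^2$.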


6. The Laplacian in polar coordinates 


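The polar coordinates $\rho, \theta$ were introduced in Example 18. The
"linear operator" $\Delta$, the Laplacian, carries functions $u(x, y)$ of class
$C^2$, defined on a fixed domain $D \subset \mathbb{R}^2$, into continuous functions:

\[
\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}.
\]

For instance, in order to solve the famous Laplace equation, $\Delta u = 0$,
which appears in many applications, we sometimes need to write the
operator $\Delta$ in polar coordinates $\rho$ and $\theta$. We know that

\[
\begin{cases}
x = \rho\cos\theta \\
y = \rho\sin\theta
\end{cases}
\]

where $\rho \in (0, \infty)$ and $\theta \in [0, 2\pi)$. The Jacobian of this transformation
is $\det J_{(\rho,\theta),\mathbf{g}} = \rho \neq 0$, where $\mathbf{g}(\rho, \theta) = (\rho\cos\theta, \rho\sin\theta)$. Let us denote
by $\bar{u}(\rho, \theta) = u(\rho\cos\theta, \rho\sin\theta)$ the new function in the new variables $\rho$
and $\theta$. Let us denote by $\rho = \rho(x, y)$ and by $\theta = \theta(x, y)$ the coordinates
of the inverse function $\mathbf{g}^{-1}$. Thus,

\[
u(x, y) = \bar{u}(\rho(x, y), \theta(x, y)).
\]

Hence,

\[
\begin{cases}
\dfrac{\partial u}{\partial x} = \dfrac{\partial \bar{u}}{\partial \rho} \dfrac{\partial \rho}{\partial x} + \dfrac{\partial \bar{u}}{\partial \theta} \dfrac{\partial \theta}{\partial x} \\[3mm]
\dfrac{\partial u}{\partial y} = \dfrac{\partial \bar{u}}{\partial \rho} \dfrac{\partial \rho}{\partial y} + \dfrac{\partial \bar{u}}{\partial \theta} \dfrac{\partial \theta}{\partial y}
\end{cases}
\tag{6.1}
\]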

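These last relations can be represented in a matrix form

\[
\begin{pmatrix} \dfrac{\partial u}{\partial x} \\[3mm] \dfrac{\partial u}{\partial y} \end{pmatrix}
=
\begin{pmatrix} \dfrac{\partial \rho}{\partial x} & \dfrac{\partial \theta}{\partial x} \\[3mm] \dfrac{\partial \rho}{\partial y} & \dfrac{\partial \theta}{\partial y} \end{pmatrix}
\begin{pmatrix} \dfrac{\partial \bar{u}}{\partial \rho} \\[3mm] \dfrac{\partial \bar{u}}{\partial \theta} \end{pmatrix}
\tag{6.2}
\]

Since $\mathbf{g} \circ \mathbf{g}^{-1}$ is the identity mapping, we have that

\[
\begin{pmatrix} \dfrac{\partial \rho}{\partial x} & \dfrac{\partial \rho}{\partial y} \\[3mm] \dfrac{\partial \theta}{\partial x} & \dfrac{\partial \theta}{\partial y} \end{pmatrix}
= J_{\mathbf{g}^{-1}} = (J_{\mathbf{g}})^{-1}
= \begin{pmatrix} \cos\theta & -\rho\sin\theta \\ \sin\theta & \rho\cos\theta \end{pmatrix}^{-1}
= \begin{pmatrix} \cos\theta & \sin\theta \\[1mm] -\dfrac{\sin\theta}{\rho} & \dfrac{\cos\theta}{\rho} \end{pmatrix}.
\]

Let us come back to formula (6.2) and find:

\[
\begin{pmatrix} \dfrac{\partial u}{\partial x} \\[3mm] \dfrac{\partial u}{\partial y} \end{pmatrix}
=
\begin{pmatrix} \cos\theta & -\dfrac{\sin\theta}{\rho} \\[3mm] \sin\theta & \dfrac{\cos\theta}{\rho} \end{pmatrix}
\begin{pmatrix} \dfrac{\partial \bar{u}}{\partial \rho} \\[3mm] \dfrac{\partial \bar{u}}{\partial \theta} \end{pmatrix}.
\]

Let us write this formula in a non-matrix form:

\[
\begin{cases}
\dfrac{\partial u}{\partial x} = \dfrac{\partial \bar{u}}{\partial \rho}\cos\theta - \dfrac{\partial \bar{u}}{\partial \theta}\dfrac{\sin\theta}{\rho} \\[3mm]
\dfrac{\partial u}{\partial y} = \dfrac{\partial \bar{u}}{\partial \rho}\sin\theta + \dfrac{\partial \bar{u}}{\partial \theta}\dfrac{\cos\theta}{\rho}
\end{cases}
\]

Let us use now these formulas and the chain rules formulas 2.7, 2.8 to
compute $\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}$:

\[
\begin{aligned}
\frac{\partial^2 u}{\partial x^2} &= \frac{\partial^2 \bar{u}}{\partial \rho^2}\cos^2\theta - 2\frac{\partial^2 \bar{u}}{\partial \rho\,\partial \theta}\frac{\sin\theta\cos\theta}{\rho} + \frac{\partial^2 \bar{u}}{\partial \theta^2}\frac{\sin^2\theta}{\rho^2} + \frac{\partial \bar{u}}{\partial \rho}\frac{\sin^2\theta}{\rho} + 2\frac{\partial \bar{u}}{\partial \theta}\frac{\sin\theta\cos\theta}{\rho^2}, \\
\frac{\partial^2 u}{\partial y^2} &= \frac{\partial^2 \bar{u}}{\partial \rho^2}\sin^2\theta + 2\frac{\partial^2 \bar{u}}{\partial \rho\,\partial \theta}\frac{\sin\theta\cos\theta}{\rho} + \frac{\partial^2 \bar{u}}{\partial \theta^2}\frac{\cos^2\theta}{\rho^2} + \frac{\partial \bar{u}}{\partial \rho}\frac{\cos^2\theta}{\rho} - 2\frac{\partial \bar{u}}{\partial \theta}\frac{\sin\theta\cos\theta}{\rho^2}.
\end{aligned}
\tag{6.3}
\]

Hence, the formula for the Laplacian in polar coordinates is:

\[
\Delta u = \frac{\partial^2 \bar{u}}{\partial \rho^2} + \frac{1}{\rho^2}\frac{\partial^2 \bar{u}}{\partial \theta^2} + \frac{1}{\rho}\frac{\partial \bar{u}}{\partial \rho}.
\]

This formula will be used later in the course of partial differential equations
with direct applications in Engineering.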
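The polar form of the Laplacian can be compared with the Cartesian one on a concrete function. In the sketch below the test function $u(x,y) = x^2y + y^3$ (whose Cartesian Laplacian is $8y$) is a hypothetical choice made for illustration, and the derivatives with respect to $\rho$ and $\theta$ are approximated by central differences with an assumed step `h`.

```python
import math

def u(x, y):
    """Hypothetical test function; its Cartesian Laplacian is 8*y."""
    return x ** 2 * y + y ** 3

def u_bar(rho, theta):
    """The same function written in polar coordinates."""
    return u(rho * math.cos(theta), rho * math.sin(theta))

def polar_laplacian(fun, rho, theta, h=1e-3):
    """u_rr + u_tt / rho^2 + u_r / rho, via central differences."""
    d_rr = (fun(rho + h, theta) - 2 * fun(rho, theta) + fun(rho - h, theta)) / h ** 2
    d_tt = (fun(rho, theta + h) - 2 * fun(rho, theta) + fun(rho, theta - h)) / h ** 2
    d_r = (fun(rho + h, theta) - fun(rho - h, theta)) / (2 * h)
    return d_rr + d_tt / rho ** 2 + d_r / rho

rho0, theta0 = 1.5, 0.8
approx = polar_laplacian(u_bar, rho0, theta0)
exact = 8 * rho0 * math.sin(theta0)   # Laplacian of u computed in x, y
```

The two values `approx` and `exact` agree up to the finite-difference error.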


7. A proof for the Local Inversion Theorem 


Here we present a complete proof for the Local Inversion Theorem
(see Theorem 80). We prefer a longer elementary proof to a shorter
sophisticated one. Let us state again this basic result.


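THEOREM 88. Let $A$ be an open subset of $\mathbb{R}^n$ and let $\mathbf{f} : A \to \mathbb{R}^n$
be a function of class $C^1$ on $A$. Let $\mathbf{a}$ be a point in $A$ such that the
Jacobian determinant $\det J_{\mathbf{a},\mathbf{f}} \neq 0$. Then there are two open sets $X \subset A$
and $Y \subset \mathbf{f}(A)$ and a uniquely determined function $\mathbf{g}$ with the following
properties:

i) $\mathbf{a} \in X$ and $\mathbf{f}(\mathbf{a}) \in Y$,

ii) $Y = \mathbf{f}(X)$,

iii) $\mathbf{g} : Y \to X$, $\mathbf{g}(Y) = X$ and $\mathbf{g}(\mathbf{f}(\mathbf{x})) = \mathbf{x}$ for any $\mathbf{x}$ in $X$,

iv) $\mathbf{g}$ is of class $C^1$ on $Y$ and the restriction of $\mathbf{f}$ to $X$, $\mathbf{f}|_X : X \to Y$,
is a diffeomorphism with $\mathbf{g} = (\mathbf{f}|_X)^{-1}$. In particular,

\[
J_{\mathbf{f}(\mathbf{a}),\mathbf{g}} = (J_{\mathbf{a},\mathbf{f}})^{-1}
\]

and

\[
\det J_{\mathbf{f}(\mathbf{a}),\mathbf{g}} = \frac{1}{\det J_{\mathbf{a},\mathbf{f}}}.
\]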

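PROOF. STEP 1. First of all let us remark that if $h_{ij}(\mathbf{x})$, $i, j =
1, 2, \dots, n$, are $n^2$ continuous functions defined on $A$, such that
$\det[h_{ij}(\mathbf{a})] \neq 0$, then there is a small closed ball $B[\mathbf{a}, r]$ with centre
at $\mathbf{a}$ and of radius $r > 0$, $B[\mathbf{a}, r] \subset A$, with the property that whenever
we take $n^2$ points $\{\mathbf{x}_{ij}\}$ in $B[\mathbf{a}, r]$, one has that $\det[h_{ij}(\mathbf{x}_{ij})] \neq 0$. Indeed,
let us define a continuous function of $n^2$ variables on the product
$A \times A \times \dots \times A$ ($n^2$ times):

\[
D(\mathbf{x}_{11}, \mathbf{x}_{12}, \dots, \mathbf{x}_{1n}, \dots, \mathbf{x}_{n1}, \mathbf{x}_{n2}, \dots, \mathbf{x}_{nn}) = \det[h_{ij}(\mathbf{x}_{ij})].
\]

Since $D(\mathbf{a}, \mathbf{a}, \dots, \mathbf{a}) = \det(h_{ij}(\mathbf{a}))$ is not zero, say $D(\mathbf{a}, \mathbf{a}, \dots, \mathbf{a}) > 0$, one
can find a small ball $B(\mathbf{a}, r') \subset A$, $r' > 0$, on which

\[
D(\mathbf{x}_{11}, \mathbf{x}_{12}, \dots, \mathbf{x}_{nn}) = \det(h_{ij}(\mathbf{x}_{ij})) > 0
\]

for every $\mathbf{x}_{ij}$ in $B(\mathbf{a}, r')$ (see Theorem 57). If one takes any $r$, $0 < r < r'$,
then $\det(h_{ij}(\mathbf{x}_{ij})) > 0$ for any arbitrary $n^2$ elements $\{\mathbf{x}_{ij}\}$ in $B[\mathbf{a}, r]$. In
our case, $\det J_{\mathbf{a},\mathbf{f}} = \det\left(\frac{\partial f_i}{\partial x_j}(\mathbf{a})\right) \neq 0$, where $\mathbf{f} = (f_1, f_2, \dots, f_n)$. Hence,
we can find a small closed ball $W = B[\mathbf{a}, r] \subset A$, $r > 0$, on which

\[
\det\left(\frac{\partial f_i}{\partial x_j}(\mathbf{x}_{ij})\right) \neq 0 \quad \text{for any } n^2 \text{ elements } \mathbf{x}_{ij} \text{ in } W.
\]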

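STEP 2. Let us prove now that the restriction of $\mathbf{f}$ to $W$ is one-to-one.
Suppose that $\mathbf{x}$ and $\mathbf{z}$ are in $W$ such that $\mathbf{f}(\mathbf{x}) = \mathbf{f}(\mathbf{z})$. This means
that for every $i = 1, 2, \dots, n$ one has that $f_i(\mathbf{x}) = f_i(\mathbf{z})$. Let us apply
the Lagrange theorem (see Theorem 73) on the segment $[\mathbf{x}, \mathbf{z}]$:

\[
0 = f_i(\mathbf{x}) - f_i(\mathbf{z}) = \sum_{j=1}^{n} \frac{\partial f_i}{\partial x_j}(\mathbf{c}^{(i)}) \cdot (x_j - z_j),
\tag{7.1}
\]

where $\mathbf{c}^{(i)}$ is a point on the segment $[\mathbf{x}, \mathbf{z}]$ and $\mathbf{x} = (x_1, x_2, \dots, x_n)$, $\mathbf{z} =
(z_1, z_2, \dots, z_n)$. Since the segment $[\mathbf{x}, \mathbf{z}]$ is contained in $W$ (why?), all
$\mathbf{c}^{(i)}$, $i = 1, 2, \dots, n$, are contained in $W$ and so $\det\left(\frac{\partial f_i}{\partial x_j}(\mathbf{c}^{(i)})\right) \neq 0$.
Hence, the homogeneous linear system

\[
0 = \sum_{j=1}^{n} \frac{\partial f_i}{\partial x_j}(\mathbf{c}^{(i)}) \cdot (x_j - z_j),
\]

$i = 1, 2, \dots, n$, in the unknowns $x_1 - z_1, x_2 - z_2, \dots, x_n - z_n$, has only the
trivial solution, i.e. $x_1 = z_1, \dots, x_n = z_n$, or $\mathbf{x} = \mathbf{z}$. Thus, $\mathbf{f}$ is one-to-one
on $W = B[\mathbf{a}, r]$.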

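STEP 3. Let us prove now that the image $\mathbf{f}(Z)$ of $Z = B(\mathbf{a}, r)$,
the interior of $W$, is an open subset of $\mathbb{R}^n$. Indeed, let us define the
continuous function $g : \partial Z \to \mathbb{R}$ (here $\partial Z = W \setminus Z$ is the boundary of
$Z$):

\[
g(\mathbf{x}) = \|\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{a})\|,
\]

for $\mathbf{x} \in \partial Z$. Since $\partial Z$ is a compact subset of $\mathbb{R}^n$ (prove it!) and since
$\mathbf{f}$ is one-to-one (see STEP 2), the minimum value $m$ of $g$ on $\partial Z$ is $> 0$
(why?). Let us denote $T = B(\mathbf{f}(\mathbf{a}), \frac{m}{2})$ and let us prove that this
open ball $T$ is contained in $\mathbf{f}(Z)$. For this, let $\mathbf{y}$ be a fixed element in
$T$ and let us define the following continuous function:

\[
h(\mathbf{x}) = \|\mathbf{f}(\mathbf{x}) - \mathbf{y}\|
\]

for any $\mathbf{x}$ in $W$. Let us see that the absolute minimum of $h$ cannot be
attained on the boundary $\partial Z$. Indeed, since

\[
h(\mathbf{a}) = \|\mathbf{f}(\mathbf{a}) - \mathbf{y}\| < \frac{m}{2},
\]

one has that $\min h(\mathbf{x}) < \frac{m}{2}$. But, if $\mathbf{x} \in \partial Z$, we have

\[
h(\mathbf{x}) = \|\mathbf{f}(\mathbf{x}) - \mathbf{y}\| \ge \|\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{a})\| - \|\mathbf{f}(\mathbf{a}) - \mathbf{y}\| > g(\mathbf{x}) - \frac{m}{2} \ge \frac{m}{2},
\]

i.e. $h(\mathbf{x}) > \frac{m}{2}$ for any $\mathbf{x}$ in $\partial Z$. Since $W$ is compact, $h$ attains its
minimum on $W$, and by the above this happens at a point of $Z$. Hence, let $\mathbf{c}$ be in $Z$ such that

\[
h(\mathbf{c}) = \min\{h(\mathbf{x}) : \mathbf{x} \in W\}.
\]

This $\mathbf{c}$ also realizes the absolute minimum for

\[
h^2(\mathbf{x}) = \|\mathbf{f}(\mathbf{x}) - \mathbf{y}\|^2 = \sum_{r=1}^{n} [f_r(\mathbf{x}) - y_r]^2.
\]

Then Fermat's theorem says that

\[
\frac{\partial}{\partial x_k}\left\{\sum_{r=1}^{n} [f_r(\mathbf{x}) - y_r]^2\right\} = 2\sum_{r=1}^{n} [f_r(\mathbf{x}) - y_r] \cdot \frac{\partial f_r}{\partial x_k}(\mathbf{x})
\]

is zero at $\mathbf{c}$, i.e.

\[
\sum_{r=1}^{n} [f_r(\mathbf{c}) - y_r] \cdot \frac{\partial f_r}{\partial x_k}(\mathbf{c}) = 0
\]

for every $k = 1, 2, \dots, n$. This is again a homogeneous linear system in
the unknowns $\{f_r(\mathbf{c}) - y_r\}$, with a nonzero determinant. Hence, we
have only the trivial solution, i.e. $f_r(\mathbf{c}) = y_r$ for every $r = 1, 2, \dots, n$.
Thus, $\mathbf{f}(\mathbf{c}) = \mathbf{y}$ and so $\mathbf{y} \in \mathbf{f}(Z)$. But the same type of reasoning can
be done for any other $\mathbf{b} = \mathbf{f}(\mathbf{e})$, where $\mathbf{e} \in Z$ and $\mathbf{b} \in \mathbf{f}(Z)$. Namely,
we take a sufficiently small open ball $B(\mathbf{e}, r'') \subset B(\mathbf{a}, r)$ and we repeat
the above reasoning for $B(\mathbf{e}, r'')$ instead of $B(\mathbf{a}, r)$. We find that

\[
T' = B\left(\mathbf{b}, \frac{m'}{2}\right) \subset \mathbf{f}(B(\mathbf{e}, r'')) \subset \mathbf{f}(Z)
\]

for the minimum $m'$ of the function

\[
\mathbf{x} \mapsto \|\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{e})\|,
\]

defined on $\partial B(\mathbf{e}, r'')$. Hence, $\mathbf{f}(Z)$ is open in $\mathbb{R}^n$. Moreover, $\mathbf{f}$ carries an
open subset $X$ of $Z$ into an open subset $\mathbf{f}(X)$ of $\mathbb{R}^n$ (why?).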

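STEP 4. Let now $Y = B(\mathbf{f}(\mathbf{a}), r')$ be an open ball centered at
$\mathbf{f}(\mathbf{a})$ such that its closure $B[\mathbf{f}(\mathbf{a}), r']$ is included in $\mathbf{f}(Z)$ and let $X =
\mathbf{f}^{-1}(Y) \cap Z$. It is clear that the restriction $\mathbf{f}|_X : X \to Y$ is a continuous
bijection between $X$ and $Y$. Let $\mathbf{g} : Y \to X$, $\mathbf{g}(\mathbf{y}) = \mathbf{x}$, be its inverse.
Let $\bar{X}$ and $\bar{Y}$ be the topological closures of $X$ and $Y$ respectively. They
are both compact subsets of $\mathbb{R}^n$ and $\mathbf{f}|_{\bar{X}} : \bar{X} \to \bar{Y}$ is also a bijection,
because $\bar{X} \subset W$ and $\mathbf{f}$ is one-to-one on $W$ (see STEP 2). Its inverse
$(\mathbf{f}|_{\bar{X}})^{-1} : \bar{Y} \to \bar{X}$ is continuous (because $\mathbf{f}$ is continuous and $\bar{X}$ and
$\bar{Y}$ are compact sets: $\mathbf{f}$ carries closed subsets of $\bar{X}$ into closed subsets of $\bar{Y}$!).
Since the restriction of $(\mathbf{f}|_{\bar{X}})^{-1}$ to $Y$ is exactly $\mathbf{g}$ (why?), $\mathbf{g}$ is also a
continuous mapping and $\mathbf{g}(\mathbf{f}(\mathbf{x})) = \mathbf{x}$ for any $\mathbf{x}$ in $X$.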

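STEP 5. It remains to prove that $\mathbf{g} = (g_1, g_2, \dots, g_n)$ is of class
$C^1$ on $Y$. We fix an $r = 1, 2, \dots, n$ and we shall prove that the partial
derivatives $\frac{\partial g_j}{\partial y_r}$ exist at any fixed point $\mathbf{y}$ in $Y$ and that they are continuous. Let $\mathbf{e}_r =
(0, 0, \dots, 0, 1, 0, \dots, 0)$ be the $r$-th unit vector in $\mathbb{R}^n$ (with 1 at the $r$-th
position!) and let us consider the difference quotient:

\[
\frac{g_j(\mathbf{y} + t\mathbf{e}_r) - g_j(\mathbf{y})}{t},
\]

where $t$ is a small real number such that $\mathbf{y} + t\mathbf{e}_r \in Y$ ($Y$ is open). Let
$\mathbf{x} = \mathbf{g}(\mathbf{y})$ and $\mathbf{x}' = \mathbf{g}(\mathbf{y} + t\mathbf{e}_r)$. Thus,

\[
\mathbf{f}(\mathbf{x}') - \mathbf{f}(\mathbf{x}) = t\mathbf{e}_r
\tag{7.2}
\]

implies that

\[
\frac{f_i(\mathbf{x}') - f_i(\mathbf{x})}{t} =
\begin{cases}
0, & \text{if } i \neq r, \\
1, & \text{if } i = r.
\end{cases}
\tag{7.3}
\]

Let us apply Lagrange's theorem (see Theorem 73) for $f_i$ on the segment
$[\mathbf{x}, \mathbf{x}'] \subset Z$. We get:

\[
0 \text{ or } 1 = \frac{f_i(\mathbf{x}') - f_i(\mathbf{x})}{t} = \sum_{j=1}^{n} \frac{\partial f_i}{\partial x_j}(\mathbf{d}^{(i)}) \cdot \frac{x_j' - x_j}{t},
\tag{7.4}
\]

$i = 1, 2, \dots, n$, where $\mathbf{d}^{(i)}$ is a point on the segment $[\mathbf{x}, \mathbf{x}'] \subset Z$. Since
$\det\left[\frac{\partial f_i}{\partial x_j}(\mathbf{d}^{(i)})\right] \neq 0$, the linear system (7.4), in the variables $\frac{x_j' - x_j}{t}$, has
a unique solution (Cramer's rule):

\[
\frac{x_j' - x_j}{t} = \frac{\Delta_j}{\Delta},
\]

$j = 1, 2, \dots, n$, where $\Delta$ and $\Delta_j$ are determinants with entries of the
form $\frac{\partial f_i}{\partial x_j}(\mathbf{d}^{(i)})$, 0, or 1. When $t \to 0$, the determinant $\Delta \to \det J_{\mathbf{x},\mathbf{f}} \neq 0$
(why?), so

\[
\left(\frac{\Delta_1}{\Delta}, \frac{\Delta_2}{\Delta}, \dots, \frac{\Delta_n}{\Delta}\right) \to \left(\frac{\partial g_1}{\partial y_r}(\mathbf{y}), \frac{\partial g_2}{\partial y_r}(\mathbf{y}), \dots, \frac{\partial g_n}{\partial y_r}(\mathbf{y})\right),
\]

i.e. all the partial derivatives $\frac{\partial g_j}{\partial y_r}(\mathbf{y})$ exist. Since their expressions
involve only partial derivatives of the type $\frac{\partial f_i}{\partial x_j}(\mathbf{x})$, which are continuous,
the function $\mathbf{g}$ is of class $C^1$ on $Y$ and the proof of the Local Inversion
Theorem is now complete.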


volve only partial derivatives of the type $4(x) which are continuous, 


The proof is long, but elementary and very natural. Trying to 
understand this proof one remembers many basic things from previous 
chapters. Moreover, the proof itself reflects some of the indescribable 
Beauty of Mathematical Analysis. 


8. The derivative of a function of a complex variable 


Let $A$ be an open subset of the complex plane $\mathbb{C}$. If we associate
to any complex number $z = x + iy$ of $A$, where $x, y$ are real numbers
and $i = \sqrt{-1}$ is a fixed root of the equation $x^2 + 1 = 0$, another
complex number $w = f(z)$, we say that the mapping $z \to f(z)$ is
a function of a complex variable defined on $A$. Like in the case of a
function of a real variable, we say that $f$ has the limit $L$ at the point
$z_0 = x_0 + iy_0$ of $A$ if for any sequence $\{z_n\}$, $n = 1, 2, \dots$, of complex
numbers $z_n = x_n + iy_n$, $x_n, y_n \in \mathbb{R}$, which tends to $z_0$, one has that
$f(z_n) \to L$. If $L = f(z_0)$ we say that $f$ is continuous at $z_0$. Let us
assume that $f(x + iy) = u(x, y) + iv(x, y)$, where $u$ and $v$ are two
real functions of two variables. One calls $u = \operatorname{Re} f$ the real part of $f$
and $v = \operatorname{Im} f$ the imaginary part of $f$. It is not difficult to see that
$f$ is continuous at $z_0 = x_0 + iy_0$ if and only if $u$ and $v$ are continuous
at $(x_0, y_0)$. Let us define the derivative of a function $f$ of a complex
variable $z$ at a fixed point $z_0$. We say that $f$ is differentiable at $z_0$ if
the following limit exists and is finite:

\[
\lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} = f'(z_0).
\tag{8.1}
\]

We denote its value by $f'(z_0)$ and we call it the derivative of $f$ at $z_0$.
For instance, $(z^2)' = 2z$, because

\[
\lim_{z \to z_0} \frac{z^2 - z_0^2}{z - z_0} = \lim_{z \to z_0} (z + z_0) = 2z_0.
\]
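The key point of definition (8.1) is that the limit must be the same no matter how $z$ approaches $z_0$. The sketch below illustrates this with Python's built-in complex numbers: for $f(z) = z^2$ the difference quotients along three directions all approach $2z_0$, while for the conjugation $z \mapsto \bar{z}$ (an extra example added here for contrast, not taken from the text) the quotient depends on the direction, so that function is not differentiable. The helper `quotient` and the step `h` are assumptions made for the illustration.

```python
import cmath

def quotient(f, z0, direction, h=1e-6):
    """Difference quotient (f(z) - f(z0)) / (z - z0) with z = z0 + h*direction."""
    dz = h * direction
    return (f(z0 + dz) - f(z0)) / dz

z0 = 1 + 2j
directions = [1, 1j, cmath.exp(1j * cmath.pi / 4)]   # three ways to approach z0

square_quotients = [quotient(lambda z: z * z, z0, d) for d in directions]
conj_quotients = [quotient(lambda z: z.conjugate(), z0, d) for d in directions]
```

All entries of `square_quotients` are close to $2z_0 = 2 + 4i$, whereas `conj_quotients` contains values close to $1$ (horizontal approach) and $-1$ (vertical approach).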


Generally speaking, the usual differentiation rules for functions of a real
variable also work for functions of a complex variable. For instance,

\[
(f + g)' = f' + g', \quad (\alpha f)' = \alpha f', \quad (fg)' = f'g + fg', \quad \left(\frac{f}{g}\right)' = \frac{f'g - fg'}{g^2},
\]

$(f \circ g)'(z) = f'(g(z)) \cdot g'(z)$, $(\sin z)' = \cos z$, $(\exp(z))' = \exp(z)$, etc.
Many formulas in complex function theory (the theory of functions 


of a complex variable) can be easily proved by using the following 
fundamental result. 


THEOREM 89. (Identity Theorem) Let $A$ be a subset of complex
numbers with at least one limit point and let $f$ and $g$ be two differentiable
complex functions defined on a complex domain $B$ (it is open
and connected) which contains $A$ together with a limit point of $A$. Assume that $f$ and $g$ are equal at any
point of $A$. Then $f$ and $g$ are identical, which means that $f(z) = g(z)$ for
all $z$ of $B$.


For a proof of this basic result see any book of complex function
theory (see for instance [ST]). Let us use this result to compute the
derivative of $\exp(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!}$, $z \in \mathbb{C}$. Let us denote by $g(z)$ the
derivative of $\exp(z)$. Since for any real number $x$ one has that $(\exp(x))' =
\exp(x)$, we have that $g(x) = \exp(x)$ for any $x$ in $\mathbb{R}$. But all the points
of $\mathbb{R}$ are limit points, so $g(z) = \exp(z)$ for all $z$ in $\mathbb{C}$. Here we tacitly used another
basic result of complex function theory.


THEOREM 90. If a complex function $f : A \to \mathbb{C}$, where $A$ is a
complex domain, is differentiable on $A$, then it has derivatives of any
order on $A$, i.e. it is of class $C^\infty$ on $A$.


Following a theory analogous to the Weierstrass theory for the
real series of functions, we can prove that $\exp(z)$ is a differentiable function.
Hence, its derivative $g(z)$ is also differentiable on $\mathbb{C}$. This is why
we could apply Theorem 89 for the complex function $\exp(z)$.

What can we say about the two variables real functions u = Re f 
and v = Im f if f is differentiable at a point zo? 


THEOREM 91. (Cauchy-Riemann relations) If the function $f(x +
iy) = u(x, y) + iv(x, y)$ is differentiable at a point $z_0 = x_0 + iy_0$, then the
two-variable real functions $u$ and $v$ have partial derivatives at $(x_0, y_0)$
and between them we have the following relations (the Cauchy-Riemann
relations):

\[
\frac{\partial u}{\partial x}(x_0, y_0) = \frac{\partial v}{\partial y}(x_0, y_0), \qquad \frac{\partial u}{\partial y}(x_0, y_0) = -\frac{\partial v}{\partial x}(x_0, y_0).
\tag{8.2}
\]

Moreover, $f'(z_0) = \frac{\partial u}{\partial x}(x_0, y_0) + i\frac{\partial v}{\partial x}(x_0, y_0) = \frac{\partial v}{\partial y}(x_0, y_0) - i\frac{\partial u}{\partial y}(x_0, y_0)$.


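PROOF. If $f$ is differentiable at the point $z_0$, the following limit
exists:

\[
\lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} = f'(z_0).
\]

This means that for any sequence $(x_n, y_n)$ which converges to $(x_0, y_0)$
(in $\mathbb{R}^2$) one has that

\[
\lim_{x_n \to x_0,\ y_n \to y_0} \frac{u(x_n, y_n) - u(x_0, y_0) + i[v(x_n, y_n) - v(x_0, y_0)]}{x_n - x_0 + i(y_n - y_0)} = f'(z_0).
\tag{8.3}
\]

Firstly take here $y_n = y_0$ for any $n = 1, 2, \dots$. We get

\[
\frac{\partial u}{\partial x}(x_0, y_0) + i\frac{\partial v}{\partial x}(x_0, y_0) = f'(z_0).
\tag{8.4}
\]

Secondly, let us consider in (8.3) $x_n = x_0$ for any $n = 1, 2, \dots$. We find

\[
\frac{1}{i}\left[\frac{\partial u}{\partial y}(x_0, y_0) + i\frac{\partial v}{\partial y}(x_0, y_0)\right] = f'(z_0).
\tag{8.5}
\]

Comparing (8.4) and (8.5) we get the Cauchy-Riemann relations (8.2).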
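The Cauchy-Riemann relations can be observed numerically on a known differentiable function. The sketch below uses `cmath.exp` as the test function and approximates the partial derivatives of $u = \operatorname{Re} f$ and $v = \operatorname{Im} f$ by central differences; the helper `partials`, the sample point and the step `h` are assumptions made only for this illustration.

```python
import cmath

def partials(f, x, y, h=1e-6):
    """Central-difference partial derivatives of u = Re f and v = Im f at (x, y)."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)   # u_x + i*v_x
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)   # u_y + i*v_y
    return fx.real, fy.real, fx.imag, fy.imag   # u_x, u_y, v_x, v_y

x0, y0 = 0.3, -1.1
u_x, u_y, v_x, v_y = partials(cmath.exp, x0, y0)
derivative = complex(u_x, v_x)   # f'(z0) = u_x + i*v_x, as in (8.4)
```

One finds `u_x` close to `v_y`, `u_y` close to `-v_x`, and `derivative` close to $\exp(z_0)$, in agreement with $(\exp z)' = \exp z$.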


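The Cauchy-Riemann relations imply that the real and the imaginary
part of a differentiable complex function are harmonic functions,
i.e. they are solutions of the Laplace equation:

\[
\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0
\tag{8.6}
\]

and

\[
\Delta v = \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0
\]

(prove it!).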


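Let $f = u + iv$ be a complex function differentiable on a complex
open subset $A$ and let $\mathbf{F}(x, y) = (v(x, y), u(x, y))$ be its associated field
of plane forces. By definition, the curl (the rotational) of $\mathbf{F}$ is the 3-D
vector field $\operatorname{curl} \mathbf{F} = \left(0, 0, \frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right)$. Since $\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}$ on $A$, one sees that
$\operatorname{curl} \mathbf{F} = \mathbf{0}$, i.e. the vector field $\mathbf{F}$ is irrotational. By definition, the
divergence of $\mathbf{F}$ is $\operatorname{div} \mathbf{F} = \frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}$. But this last one is 0 because of the
second Cauchy-Riemann relation.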

Moreover, if one knows one of the two functions $u$ or $v$, one can
determine the other up to an additive real constant, such that the couple
$(u, v)$ be the real and the imaginary part respectively of a differentiable
complex function $f$. Indeed, suppose we know $u$ and we want to find $v$
from the Cauchy-Riemann relations:

\[
\frac{\partial v}{\partial x}(x, y) = -\frac{\partial u}{\partial y}(x, y)
\tag{8.7}
\]

and

\[
\frac{\partial v}{\partial y}(x, y) = \frac{\partial u}{\partial x}(x, y).
\tag{8.8}
\]

From (8.7) we can write

\[
v(x, y) = -\int \frac{\partial u}{\partial y}(x, y)\,dx + C(y).
\]

We prove that we can determine the unknown function $C(y)$ up to a
constant term. Let us come to the relation (8.8) with this last expression
of $v$. Here we use the famous Leibniz formula for differentiating
an integral with a parameter (see the Integral calculus in any course
of Analysis):

\[
\frac{\partial u}{\partial x}(x, y) = \frac{\partial v}{\partial y}(x, y) = -\int \frac{\partial^2 u}{\partial y^2}(x, y)\,dx + C'(y).
\]

From (8.6), i.e. from the Laplace equation $\Delta u = 0$, we find

\[
\frac{\partial u}{\partial x}(x, y) = \int \frac{\partial^2 u}{\partial x^2}(x, y)\,dx + C'(y) = \frac{\partial u}{\partial x}(x, y) + K(y) + C'(y),
\tag{8.9}
\]

where $C(y)$ and $K(y)$ are functions of $y$. From (8.9) we get

\[
C'(y) = -K(y).
\]

Therefore, one can always find the function $C(y)$, and so the function
$v(x, y)$, up to a real constant $c$. Hence, we can determine the function
$f = u + iv$ up to a purely imaginary constant $ic$.

For instance, let us consider $u(x, y) = x^2 - y^2$ and let us find $f$ (if
it is possible! It is, because $u$ is a harmonic function, and this is the only
thing we used above!). The Cauchy-Riemann relations become:

\[
\frac{\partial v}{\partial x}(x, y) = 2y
\]

and

\[
\frac{\partial v}{\partial y}(x, y) = 2x.
\]

Let us integrate the first equality with respect to $x$:

\[
v(x, y) = 2xy + C(y),
\]

where $C(y)$ is a constant function with respect to $x$ but... it can depend
on $y$! Come now to the second relation and find

\[
2x = 2x + C'(y),
\]

so $C'(y) = 0$, i.e. $C(y)$ does not depend on $y$. It is a pure constant $c$.
Hence, $v(x, y) = 2xy + c$ and $f(z) = x^2 - y^2 + i(2xy + c) = (x + iy)^2 + ic$,
where $c$ is a real arbitrary constant.
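The result of this example can be confirmed by direct evaluation. The short sketch below assembles $f = u + iv$ from $u = x^2 - y^2$ and $v = 2xy + c$ and compares it with $z^2 + ic$ at a few sample points; the sample points and the value of the constant $c$ are arbitrary choices made for the illustration.

```python
def u(x, y):
    """Given real part: u = x^2 - y^2."""
    return x * x - y * y

def v(x, y, c=0.0):
    """Harmonic conjugate found in the text: v = 2xy + c."""
    return 2 * x * y + c

def f_from_parts(x, y, c=0.0):
    """f = u + i*v assembled from its real and imaginary parts."""
    return complex(u(x, y), v(x, y, c))

def f_closed_form(z, c=0.0):
    """The closed form f(z) = z^2 + i*c."""
    return z * z + 1j * c

samples = [(0.5, -1.0), (2.0, 3.0), (-1.5, 0.25)]
differences = [
    abs(f_from_parts(x, y, c=0.7) - f_closed_form(complex(x, y), c=0.7))
    for x, y in samples
]
```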

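Let us now come back to formula (8.1) and consider an arbitrary
smooth curve $\gamma$ which passes through $z_0$. Let us take $z$ very close to $z_0$
but on the curve $\gamma$. So, we can approximate:

\[
\frac{f(z) - f(z_0)}{z - z_0} \approx f'(z_0).
\tag{8.10}
\]

Hence,

\[
|f(z) - f(z_0)| \approx |z - z_0|\,|f'(z_0)| = |z - z_0| \sqrt{\left[\frac{\partial u}{\partial x}(x_0, y_0)\right]^2 + \left[\frac{\partial v}{\partial x}(x_0, y_0)\right]^2}.
\]

So, the length of the segment $[f(z_0), f(z)]$ is approximately proportional to the length
of the segment $[z_0, z]$. The "dilation" coefficient

\[
|f'(z_0)| = \sqrt{\left[\frac{\partial u}{\partial x}(x_0, y_0)\right]^2 + \left[\frac{\partial v}{\partial x}(x_0, y_0)\right]^2}
\]

does not depend on the curve on which $z$ becomes closer and closer to
$z_0$.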

Let us recall that any nonzero complex number $z$ can be uniquely written as
$z = r\exp(i\alpha)$, where $r = |z| > 0$ and $\alpha \in [0, 2\pi)$. This angle $\alpha$ is called the argument
of $z$. From the formula (8.10) we get

\[
\arg[f(z) - f(z_0)] \approx \arg(z - z_0) + \arg f'(z_0).
\tag{8.11}
\]

Here we assume that $f'(z_0) \neq 0$. Formula (8.11) says that in a small
neighborhood of $z_0$ our differentiable function preserves the angle between
two curves which pass through $z_0$ (why?). So, we can locally
approximate the action of a differentiable function by a rotation of
angle $\arg f'(z_0)$, followed by a "dilation" (or a "contraction") of coefficient
$|f'(z_0)|$. We assume that $f'(z_0) \neq 0$. Otherwise, the transformation
$z \to f(z)$ is almost constant around $z_0$. A transformation of the
complex plane into itself with these last two properties is called a conformal
transformation. These are very important in some engineering
applications (hydraulics, fluid mechanics, electricity, etc.).
If we write the plane transformation $z \to f(z)$ as

\[
(x, y) \to (u(x, y), v(x, y)),
\]

where $f(z) = u + iv$, the Jacobian determinant of this at $(x_0, y_0)$ is

\[
\begin{vmatrix}
\dfrac{\partial u}{\partial x}(x_0, y_0) & \dfrac{\partial u}{\partial y}(x_0, y_0) \\[3mm]
\dfrac{\partial v}{\partial x}(x_0, y_0) & \dfrac{\partial v}{\partial y}(x_0, y_0)
\end{vmatrix}
= \left[\frac{\partial u}{\partial x}(x_0, y_0)\right]^2 + \left[\frac{\partial v}{\partial x}(x_0, y_0)\right]^2 = |f'(z_0)|^2.
\]

Here we used again the Cauchy-Riemann relations. If we want
our transformation $z \to f(z)$ to be locally invertible around the point $z_0$,
we must assume that $f'(z_0) \neq 0$ (see the Local Inversion Theorem). In
this last case, this transformation is locally a conformal transformation,
i.e. it preserves the angles (with their directions) and it changes the
lengths with the same "velocity" around the point $z_0$.
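Both facts, the Jacobian identity and the preservation of angles, can be illustrated on a concrete function. In the sketch below $f(z) = z^2$ is a sample function chosen only for illustration (so $u = x^2 - y^2$, $v = 2xy$); `image_angle` is a hypothetical helper that measures the oriented angle between the images of two short segments issuing from $z_0$, and the step `h` is an assumption.

```python
import cmath

def f(z):
    """Sample differentiable function (an assumption for illustration): f(z) = z^2."""
    return z * z

def f_prime(z):
    """Derivative of the sample function."""
    return 2 * z

def jacobian_det(x, y):
    """det of [[u_x, u_y], [v_x, v_y]] for u = x^2 - y^2, v = 2xy."""
    return (2 * x) * (2 * x) - (-2 * y) * (2 * y)

def image_angle(z0, d1, d2, h=1e-6):
    """Oriented angle from the image of direction d2 to the image of direction d1."""
    w1 = f(z0 + h * d1) - f(z0)
    w2 = f(z0 + h * d2) - f(z0)
    return cmath.phase(w1 / w2)

z0 = 1 + 1j
d1, d2 = 1, cmath.exp(1j * cmath.pi / 3)   # two directions through z0
angle_before = cmath.phase(d1 / d2)        # oriented angle between the directions
angle_after = image_angle(z0, d1, d2)      # oriented angle between their images
```

Here `jacobian_det(1.0, 1.0)` equals $|f'(z_0)|^2 = 8$, and `angle_after` agrees with `angle_before` up to an error of the order of `h`.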


9. Problems 
1. Find $y'(x)$ if $y = 1 + y^x$. Why can we not perform this computation
for the points on the curve $xy^{x-1} = 1$, $y > 0$?

2. Compute $\frac{dy}{dx}$ and $\frac{d^2y}{dx^2}$, if $y = x + \ln y$, $y \neq 1$.


a: Lt = 24,4) and 
xg + 2 + 2° — 3ryz — 2y+3=0, 

find dz and d?z. 

4. Find $\inf f$ and $\sup f$ for:

a) $f(x, y) = x^3 + 3xy^2 - 15x - 12y$;

b) $f(x, y) = xy$ with $x + y - 1 = 0$;

c) $f(x, y, z) = x^2 + y^2 + z^2$ with $ax + by + cz - 1 = 0$ (What does this mean?);

5. Find the distance from M(0,0,1) to the curve {fy = 27} N {z = 


. Compute the velocity and the acceleration on the circle
$\{x^2 + y^2 + z^2 = a^2\} \cap \{x + y + z = a\}$
by using a parametrization of the type: $x = x$, $y = y(x)$, $z = z(x)$.
8. Are the functions
u=(e2+y+z)’,0 = 32 —yt3z,w = 2" + ay t+ yz +22 


independent at (0,0, 0)? 
9. Change the variables in the following expressions: 


0,u=xy,v 


2 

c) (aay -- (3) , © = pcos), y = psin#; 

10. Find all $\Phi$ such that $u = \Phi(x + y)$ and $v = \Phi(x)\Phi(y)$ are
dependent on $\mathbb{R}^2$.

11. Prove that the following complex functions are differentiable
and find their derivatives. Take a point $z_0$ and study the geometrical
behavior of the transformation $z \to f(z)$ around this point $z_0$.

a) f(z) = 32+ 2; b) f(z) = 2iz + 3; c) f(z) = 3, lal > 1; 

d) f(z) =exp(iz); e) f(z) = 2° +2, 2 #0; g) f(z) = zsinz; 

